Data Engineering

Forum Posts

by d_meaker • New Contributor II

04-04-2023 3:11:03 AM

3161 Views
3 replies
0 kudos

map_keys() returns an empty array in Delta Live Table pipeline.

We are exploding a map type column into multiple columns based on the keys of the map column. Part of this process is to extract the keys of a map type column called json_map as illustrated in the snippet below. The code executes as expected when run...

Latest Reply

d_meaker
New Contributor II

04-17-2023 6:13:55 AM

0 kudos

Hi @Suteja Kanuri, thank you for your response and explanation. The code I have shown above is not the exact snippet we are using. Please find the exact snippet below. We are dynamically extracting the keys of the map and then using getitem() to mak...

2 More Replies

by Neerajkirola • New Contributor

04-17-2023 6:13:21 AM

4795 Views
0 replies
0 kudos

Types of RAM: An In-Depth Overview

Random Access Memory (RAM) is an essential component of any computer system, responsible for temporarily storing data that the CPU (Central Processing Unit) needs to access quickly. It allows for faster data retrieva...

by burhanudinera20 • New Contributor II

04-14-2023 11:53:23 PM

16971 Views
3 replies
0 kudos

Cannot import name 'Test' from partially initialized module 'databricks_test_helper'

I installed the package with the command 'pip install databricks_test_helper'. I then get an ImportError when I try running this code on cloud Databricks:
from databricks_test_helper import *
expected = set([(s, 'double') for s in ('AP', 'AT', 'PE'...

Latest Reply

Anonymous
Not applicable

04-16-2023 12:17:40 AM

0 kudos

@Burhanudin Badiuzaman: The error message suggests that there may be a circular import happening within the databricks_test_helper module, which is preventing the Test class from being properly imported. One possible solution is to import the Test cl...

2 More Replies

by rsamant07 • New Contributor III

03-22-2023 9:12:35 AM

8626 Views
11 replies
2 kudos

Resolved! DBT Job Type Authenticating to Azure Devops for git_source

We are trying to execute Databricks jobs of the dbt task type, but they fail to authenticate to git. The problem is that the job is created using a service principal, but the service principal doesn't seem to have access to the repo. A few questions we have: 1) can we giv...

Latest Reply

Anonymous
Not applicable

03-29-2023 10:42:16 PM

2 kudos

Hi @Rahul Samant, I'm sorry you could not find a solution to your problem in the answers provided. Our community strives to provide helpful and accurate information, but sometimes an immediate solution may only be available for some issues. I suggest p...

10 More Replies

by sensanjoy • Contributor II

04-10-2023 5:52:57 AM

11918 Views
7 replies
3 kudos

Authenticate Databricks REST API and access delta tables from external web service.

Hi All, We have a requirement to access delta tables from an external web service (Web UI). Presently we have tested it through a JDBC connection and authenticated using a PAT. Ex. jdbc:spark://[DATABRICKS_HOST]:443/default;transportMode=http;ssl=1;httpPath...

Latest Reply

sensanjoy
Contributor II

04-12-2023 9:42:08 AM

3 kudos

Hi @Suteja Kanuri, could you please help me with the above queries?

6 More Replies

by Anonymous • Not applicable

04-17-2023 5:44:39 AM

7577 Views
0 replies
0 kudos

As companies grow and evolve, a Chief Technology Officer (CTO) becomes crucial in shaping the organization's technical direction and driving innovation. Regarding filling this critical leadership position, companies decide to either promote an existi...

by Pien • New Contributor II

04-12-2023 12:16:12 AM

13690 Views
5 replies
0 kudos

Resolved! Getting date out of year and week

Hi all, I'm trying to get a date out of the columns year and week. The week format is not recognized.
df_loaded = df_loaded.withColumn("week_year", F.concat(F.lit("3"),F.col('Week'), F.col('Jaar')))
df_loaded = df_loaded.withColumn("date", F.to_date(F...

Latest Reply

Anonymous
Not applicable

04-13-2023 1:57:56 AM

0 kudos

Hi @Pien Derkx, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

4 More Replies
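
For the year-and-week question above, the underlying conversion can be sketched in plain Python. This is only an illustration, not the poster's PySpark code or the accepted answer. It assumes the week column holds ISO 8601 week numbers, and the function name date_from_year_week is hypothetical. The weekday argument defaults to Monday; the literal "3" in the original snippet may have been meant as a day of the week, but that is a guess.

```python
from datetime import date


def date_from_year_week(year: int, week: int, iso_weekday: int = 1) -> date:
    """Return the date of the given ISO weekday (1 = Monday, 7 = Sunday)
    within the given ISO 8601 week of the given ISO year."""
    # date.fromisocalendar is available from Python 3.8 onward.
    return date.fromisocalendar(year, week, iso_weekday)


# Monday and Wednesday of ISO week 15 in 2023.
print(date_from_year_week(2023, 15))
print(date_from_year_week(2023, 15, iso_weekday=3))
```

Note that ISO week 1 is the week containing the first Thursday of the year, so the first days of January can belong to the last week of the previous ISO year.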

by QuicKick • New Contributor

02-13-2023 10:01:21 AM

12944 Views
2 replies
0 kudos

How do I search for all the columns/field names starting with "XYZ"

I would like to do a big search on all field/column names that contain "XYZ". I tried the SQL below but it's giving me an error.
SELECT table_name, column_name
FROM information_schema.columns
WHERE column_name like '%<account>%'
order by table_name, column_na...

Latest Reply

Anonymous
Not applicable

04-17-2023 2:55:21 AM

0 kudos

Hi @Ian Fox, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your ...

1 More Replies
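
The title of the column-search question asks for names starting with "XYZ", while the query in the post uses a contains pattern. The difference between the two LIKE patterns can be sketched with Python's built-in sqlite3 module. Everything here is an assumption for illustration: the catalog_columns table is a hypothetical in-memory stand-in for information_schema.columns, the rows are invented, and SQLite is not the engine the poster is using.

```python
import sqlite3


def find_columns(conn: sqlite3.Connection, pattern: str) -> list:
    """Return (table_name, column_name) rows whose column name matches
    the given SQL LIKE pattern, sorted by table and column name."""
    query = (
        "SELECT table_name, column_name "
        "FROM catalog_columns "
        "WHERE column_name LIKE ? "
        "ORDER BY table_name, column_name"
    )
    return conn.execute(query, (pattern,)).fetchall()


# Hypothetical stand-in for information_schema.columns, built in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE catalog_columns (table_name TEXT, column_name TEXT)")
conn.executemany(
    "INSERT INTO catalog_columns VALUES (?, ?)",
    [
        ("orders", "XYZ_id"),
        ("orders", "amount"),
        ("customers", "XYZ_name"),
        ("customers", "old_XYZ"),
    ],
)

# 'XYZ%' matches names that start with XYZ; '%XYZ%' matches names that
# contain XYZ anywhere. SQLite's LIKE ignores ASCII case by default,
# which may differ from other engines.
starts_with = find_columns(conn, "XYZ%")
contains = find_columns(conn, "%XYZ%")
print(starts_with)
print(contains)
```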

by kaileena • New Contributor

04-14-2023 3:19:30 AM

2129 Views
2 replies
0 kudos

cannot install RMySQL "there is no package called ‘RMySQL’"

I cannot install RMySQL on Databricks. I tried:
install.packages("RMySQL")
I got the error:
Installing package into ‘/local_disk0/.ephemeral_nfs/envs/rEnv-c677bc4c-e6a3-40df-a5ab-bfd5d277e0c0’ (as ‘lib’ is unspecified) Warning: unable to access index for ...

Latest Reply

Anonymous
Not applicable

04-17-2023 2:34:14 AM

0 kudos

Hi @miru miro, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...

1 More Replies

by Merchiv • New Contributor III

04-14-2023 12:13:55 AM

9834 Views
4 replies
0 kudos

Difference between Databricks and local pyspark split.

I have noticed some inconsistent behavior between calling the 'split' function on Databricks and on my local installation. Running it in a Databricks notebook gives
spark.sql("SELECT split('abc', ''), size(split('abc',''))").show()
So the string is split...

Latest Reply

Anonymous
Not applicable

04-16-2023 12:26:44 AM

0 kudos

@Ivo Merchiers: The behavior you are seeing is likely due to differences in the underlying version of Apache Spark between your local installation and Databricks. split() is a function provided by Spark's SQL functions, and different versions of Spa...

3 More Replies

by arw1070 • New Contributor II

04-05-2023 10:36:23 AM

3695 Views
3 replies
0 kudos

Databricks extension is not configuring in VScode

I am trying to install and work with the Databricks VS Code extension. I installed it a few weeks ago, and it initially worked, but I mistyped some of the configuration, so I tried to restart. Since then it has not worked. Whenever I install the exten...

Latest Reply

karthik_p
Esteemed Contributor

04-05-2023 12:34:15 PM

0 kudos

@Anna Wuest I have tried this and am not seeing any issues. Which version of VS Code are you using? Can you please update to the latest Visual Studio Code version, 1.77.1, then try to install the Databricks plugin and test. If you are using Windows --> pleas...

2 More Replies

by GuMart • New Contributor III

04-13-2023 12:31:12 AM

3424 Views
2 replies
1 kudos

Delta Live Tables - RETRY_ON_FAILURE

Hi, Is it possible to set up the RETRY_ON_FAILURE property for DLTs through the API? I'm not finding it in the docs (although it seems to exist in a response payload). https://docs.databricks.com/delta-live-tables/api-guide.html

Latest Reply

GuMart
New Contributor III

04-16-2023 10:58:19 PM

1 kudos

Hi @Suteja Kanuri, Thank you so much for the quick and complete answer! Regards,

1 More Replies

by alm • New Contributor III

04-11-2023 4:51:59 AM

8613 Views
2 replies
2 kudos

Resolved! Vectorized reading of parquet file containing decimal type column(s)

I was trying to read a parquet file that contains decimal type columns and write it to a delta table. I encountered a problem that is pretty neatly described by this kb.databricks article, and which I solved by disabling the vector...

Latest Reply

Anonymous
Not applicable

04-15-2023 6:07:25 PM

2 kudos

@Alberte Mørk: The behavior you observed is due to a known issue in Apache Spark when vectorized reading is used with Parquet files that contain decimal type columns. As you mentioned, the issue can be resolved by disabling vectorized reading for th...

1 More Replies

by Anonymous • Not applicable

04-10-2023 6:21:27 PM

2905 Views
2 replies
2 kudos

Hello Everyone, I'm interested to learn about the certifications you're pursuing to enhance your skills. Sharing your goals can inspire those who may have started their certification journey but struggled with motivation. Personally, I recently comple...

Latest Reply

FJ
Contributor III

04-16-2023 7:32:24 PM

2 kudos

I'm trying the Data Engineering professional exam at the end of the month. It's like a shot in the dark because no practice exams are available and from what I've seen online from people who already passed it, the Advanced Data Engineering with ...

1 More Replies

by Anonymous • Not applicable

04-13-2023 11:57:55 AM

10992 Views
8 replies
0 kudos

Not able to connect to On-Prem Oracle from Databricks cluster

Hi Everyone, I was trying to connect to an Oracle instance from a Databricks cluster and it is giving the error below:
java.sql.SQLTimeoutException: ORA-12170: Cannot connect. TCP connect timeout of 30000ms for host xx.x.x.*** port 1521. (CONNECTION_ID=CgM7V7UB...

Latest Reply

Anonymous
Not applicable

04-16-2023 12:29:47 AM

0 kudos

@Satya89: The error message you received indicates that the TCP connection to the Oracle database timed out. This could be caused by a number of factors such as network issues, firewall restrictions, or the database being overloaded. Here are a few ste...

7 More Replies
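
Since the reply above points to network issues or firewall restrictions as possible causes of the Oracle timeout, one way to separate a network problem from a database problem is a plain TCP reachability check, run from the machine that needs the connection. The sketch below uses only Python's standard socket module. The function name can_connect is hypothetical, and the host and port in the usage lines are placeholders, since the real host is elided in the post.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder target; substitute the real Oracle host and listener port.
    print(can_connect("127.0.0.1", 1521, timeout=2.0))
```

A False result only shows that the TCP handshake did not complete from that machine; it does not say which hop or rule blocked it.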

Databricks Community
