Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

chari
by Contributor
  • 6281 Views
  • 2 replies
  • 0 kudos

writing spark dataframe as CSV to a repo

Hi, I wrote a Spark dataframe as CSV to a repo (synced with GitHub), but when I checked the folder, the file wasn't there. Here is my code: spark_df.write.format('csv').option('header','true').mode('overwrite').save('/Repos/abcd/mno/data') No error mes...

Latest Reply
feiyun0112
Honored Contributor
  • 0 kudos

The folder 'Repos' is not your repo; it's `dbfs:/Repos`. Please check: dbutils.fs.ls('/Repos/abcd/mno/data')
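A minimal sketch of the fix this reply implies (the paths are the poster's; `file:/Workspace/Repos/...` is an assumption about where the Git-synced checkout lives): a bare '/Repos/...' path passed to Spark resolves against DBFS, so writing through the file: scheme targets the workspace repo instead.

```python
# Confirm where the CSV actually landed: the save above wrote to dbfs:/Repos/...
dbutils.fs.ls('/Repos/abcd/mno/data')

# Sketch, not a verified fix: write into the workspace repo checkout instead
# (assumes the repo is synced at /Workspace/Repos/abcd/mno).
(spark_df.write.format('csv')
    .option('header', 'true')
    .mode('overwrite')
    .save('file:/Workspace/Repos/abcd/mno/data'))
```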

  • 0 kudos
1 More Replies
Salman1
by New Contributor
  • 953 Views
  • 0 replies
  • 0 kudos

Cannot find UDF on subsequent job runs on same cluster.

Hello, I am trying to run jobs with a JAR task type using Databricks on AWS on an all-purpose cluster. The issue I'm facing is that the job completes the first run successfully, but on any subsequent runs it will fail. I have to restart my cluste...

chari
by Contributor
  • 3023 Views
  • 2 replies
  • 0 kudos

Fatal error when writing a big pandas DataFrame

Hello DB community, I was trying to write a pandas DataFrame containing 100,000 rows as Excel. Moments into the execution I received a fatal error: "Python kernel is unresponsive." However, I am constrained from increasing the number of clusters or other...

Data Engineering
Databricks
excel
python
Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @chari, thanks for bringing up your concerns, always happy to help. We understand that you are facing the following error while writing a pandas DataFrame containing 100,000 rows to Excel. As per the error >>> Fatal error: The Python kernel ...
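One commonly suggested workaround (an assumption on my part, not from the truncated reply) is to stream the workbook to disk with xlsxwriter's constant_memory mode so the driver never holds the whole sheet in memory:

```python
import pandas as pd

# Stand-in for the poster's 100,000-row DataFrame.
pdf = pd.DataFrame({'id': range(100_000)})

# constant_memory flushes each row to disk as it is written, capping driver
# memory use (requires the xlsxwriter package to be installed).
with pd.ExcelWriter('/tmp/out.xlsx', engine='xlsxwriter',
                    engine_kwargs={'options': {'constant_memory': True}}) as writer:
    pdf.to_excel(writer, index=False)
```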

  • 0 kudos
1 More Replies
Yaacoub
by New Contributor
  • 8980 Views
  • 2 replies
  • 1 kudos

[UDF_MAX_COUNT_EXCEEDED] Exceeded query-wide UDF limit of 5 UDFs

In my project I defined a UDF:

@udf(returnType=IntegerType())
def ends_with_one(value, bit_position):
    if bit_position + len(value) < 0:
        return 0
    else:
        return int(value[bit_position] == '1')

spark.udf.register("ends_with_one"...
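If the query-wide UDF limit is the blocker, one hedged workaround (not from the thread) is to express the check with built-in SQL functions so the query needs no Python UDF at all; Spark SQL's substr accepts a negative position counted from the end, matching the UDF's negative bit_position:

```python
from pyspark.sql import functions as F

bit_pos = -1  # hypothetical position, counted from the end as in the UDF

df = spark.createDataFrame([('1011',), ('0110',)], ['value'])
# An out-of-range position yields an empty substring, which compares unequal
# to '1' and casts to 0, mirroring the UDF's guard clause.
result = df.withColumn(
    'ends_with_one',
    (F.expr(f"substr(value, {bit_pos}, 1)") == '1').cast('int'))
```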

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Hi @Yaacoub, Just a friendly follow-up. Have you had a chance to review my colleague's reply? Please inform us if it contributes to resolving your query.

  • 1 kudos
1 More Replies
abelian-grape
by New Contributor II
  • 7384 Views
  • 4 replies
  • 0 kudos

Intermittent error databricks job kept running

Hi, I have the following error, but the job kept running. Is that normal?

{
  "message": "The service at /api/2.0/jobs/runs/get?run_id=899157004942769 is temporarily unavailable. Please try again later. [TraceId: -]",
  "error_code": "TEMPORARILY_U...
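Where a polling client hits this, a hedged sketch (function name and retry policy are assumptions, not from the thread) is to retry the runs/get call with backoff rather than treating the transient error as terminal:

```python
import time
import requests

def get_run_with_retry(host, token, run_id, attempts=5):
    """Poll /api/2.0/jobs/runs/get, retrying on TEMPORARILY_UNAVAILABLE."""
    for attempt in range(attempts):
        resp = requests.get(
            f"{host}/api/2.0/jobs/runs/get",
            headers={"Authorization": f"Bearer {token}"},
            params={"run_id": run_id},
            timeout=30,
        )
        body = resp.json()
        if body.get("error_code") != "TEMPORARILY_UNAVAILABLE":
            return body
        time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...
    raise RuntimeError("runs/get still unavailable after retries")
```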

Latest Reply
abelian-grape
New Contributor II
  • 0 kudos

@Ayushi_Suthar also, whenever it happens the job status does not change to "failed"; it keeps running. Is that normal?

  • 0 kudos
3 More Replies
joao_vnb
by New Contributor III
  • 45802 Views
  • 7 replies
  • 11 kudos

Resolved! Automate the Databricks workflow deployment

Hi everyone, do you know if it's possible to automate Databricks workflow deployment through Azure DevOps (like what we do with the deployment of notebooks)?

Latest Reply
asingamaneni
New Contributor II
  • 11 kudos

Did you get a chance to try Brickflows (https://github.com/Nike-Inc/brickflow)? You can find the documentation here: https://engineering.nike.com/brickflow/v0.11.2/ Brickflow uses Databricks Asset Bundles (DAB) under the hood but provides a Pythonic w...

  • 11 kudos
6 More Replies
isaac_gritz
by Databricks Employee
  • 7824 Views
  • 1 reply
  • 2 kudos

Change Data Capture with Databricks

How to leverage Change Data Capture (CDC) from your databases to Databricks. Change Data Capture allows you to ingest and process only changed records from database systems, dramatically reducing data processing costs and enabling real-time use cases suc...
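As one concrete illustration (table name is hypothetical, not from the post), Delta's change data feed exposes per-row change records that downstream jobs can consume incrementally:

```python
# Read the change feed of a Delta table from version 0 onward; _change_type
# distinguishes inserts, updates, and deletes. Assumes the table was created
# with the delta.enableChangeDataFeed property set to true.
changes = (spark.read.format('delta')
           .option('readChangeFeed', 'true')
           .option('startingVersion', 0)
           .table('main.sales.orders'))  # hypothetical table name
changes.select('_change_type', '_commit_version').show()
```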

Latest Reply
prasad95
New Contributor III
  • 2 kudos

Hi @isaac_gritz, can you provide any reference resource for achieving AWS DynamoDB CDC to Delta tables? Thank you.

  • 2 kudos
DatBoi
by Contributor
  • 4584 Views
  • 2 replies
  • 1 kudos

Resolved! What happens to table created with CTAS statement when data in source table has changed

Hey all - I am sure this has been documented / answered before but what happens to a table created with a CTAS statement when data in the source table has changed? Does the sink table reflect the changes? Or is the data stored when the table is defin...

Latest Reply
SergeRielau
Databricks Employee
  • 1 kudos

CREATE TABLE AS (CTAS) is a "one and done" kind of statement. The new table retains no memory of how it came to be, so it is oblivious to changes in the source. Views, as you say, are stored queries; no data is persisted, and therefore the query...
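A small sketch of the contrast the reply draws (src is a hypothetical one-column table):

```python
spark.sql("CREATE TABLE t AS SELECT * FROM src")  # snapshot, frozen at creation
spark.sql("CREATE VIEW v AS SELECT * FROM src")   # stored query, re-run on read

spark.sql("INSERT INTO src VALUES (42)")
spark.table("t").count()  # unchanged: CTAS keeps no link to src
spark.table("v").count()  # one higher: the view re-evaluates src on each read
```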

  • 1 kudos
1 More Replies
Dhruv-22
by New Contributor III
  • 9025 Views
  • 4 replies
  • 1 kudos

Resolved! Managed table overwrites existing location for delta but not for oth

I am working on Azure Databricks with Databricks Runtime 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12), and I am facing the following issue. Suppose I have a view named v1 and a database f1_processed created from the following comman...

Latest Reply
Red_blue_green
New Contributor III
  • 1 kudos

Hi, this is how the Delta format works. With overwrite you are not deleting or replacing the files in the folder; Delta creates new files with the overwritten schema and data. This way you are also able to return to former versions of the Del...
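A brief sketch of that behaviour (df stands for the poster's DataFrame, the table name is hypothetical): an overwrite adds a new commit on top of the old files, and time travel can still read the earlier version.

```python
# Overwrite writes new data files and a new commit; the old files remain
# on storage until a VACUUM removes them.
df.write.format('delta').mode('overwrite').saveAsTable('f1_processed.t')

# Earlier versions stay readable via time travel.
old = spark.read.option('versionAsOf', 0).table('f1_processed.t')
```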

  • 1 kudos
3 More Replies
sanjay
by Valued Contributor II
  • 11801 Views
  • 1 reply
  • 0 kudos

pyspark dropDuplicates performance issue

Hi, I am trying to delete duplicate records found by key, but it's very slow. It's a continuously running pipeline, so the data is not that huge, but it still takes time to execute this command: df = df.dropDuplicates(["fileName"]) Is there any better approach to d...
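Since the thread's reply was lost to a rendering error, here is one hedged alternative for a continuously running (streaming) pipeline, under stated assumptions: bound the deduplication state with a watermark instead of tracking every key ever seen.

```python
# Sketch under assumptions: stream_df is the pipeline's streaming DataFrame and
# ingestTime is a hypothetical event-time column. The watermark limits dedup
# state to the last hour rather than the full history of fileName values.
deduped = (stream_df
           .withWatermark('ingestTime', '1 hour')
           .dropDuplicates(['fileName', 'ingestTime']))
```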

Accn
by New Contributor
  • 1048 Views
  • 1 reply
  • 0 kudos

Dashboard from Notebook - How to schedule

A notebook is created with insights, and I have created a dashboard (not a SQL dashboard) from it. I need to schedule this. I have tried scheduling via a workflow, but it only takes you to the notebook; even the schedule from the dashboard takes me to the notebook and not the dashbo...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @Accn, thanks for bringing up your concerns, always happy to help. We understand your concern, but right now the only way to refresh a notebook dashboard is via scheduled jobs. To schedule a dashboard to refresh at a specified interval, click...

  • 0 kudos
chrisf_sts
by New Contributor II
  • 7789 Views
  • 1 reply
  • 1 kudos

Resolved! After moving mounted s3 bucket under unity catalog control, python file paths no longer work

I had been using a mounted external S3 bucket with JSON files up until a few days ago, when my company moved all file mounts under control of Unity Catalog. Suddenly I can no longer run a command like: with open("/mnt/my_files/my_json....

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 1 kudos

Hi @chrisf_sts, thanks for bringing up your concerns, always happy to help. May I know which cluster access mode you are using to run the notebook commands? Can you please try to run the command below in single-user cluster access mode? "with open(...
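If single-user mode is not an option, a commonly suggested path (an assumption, not from the truncated reply) is to reference the data through a Unity Catalog volume instead of the old mount:

```python
# Hypothetical volume path; Unity Catalog governs access to volumes, so plain
# Python file I/O works on access modes where /mnt paths are blocked.
with open('/Volumes/my_catalog/my_schema/my_files/my_json.json') as f:
    payload = f.read()
```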

  • 1 kudos
Miasu
by New Contributor II
  • 1019 Views
  • 0 replies
  • 0 kudos

Unable to analyze external table | FileAlreadyExistsException

Hello experts, there's a CSV file, "nyc_taxi.csv", saved under users/myfolder on DBFS, and I used this file to create 2 tables: 1. nyc_taxi: created using the UI; it appeared as a managed table saved under dbfs:/user/hive/warehouse/mydatabase.db/nyc...

