Data Engineering

Forum Posts

by QuantumFries • Visitor

yesterday

30 Views
0 replies
0 kudos

Change {{job.start_time.[iso_date]}} Timezone

I am trying to schedule some jobs using workflows and dynamic variables. One caveat: when I use {{job.start_time.[iso_date]}}, it seems to default to UTC. Is there a way to change the timezone?
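
One possible workaround, offered as a sketch rather than a confirmed Databricks feature: pass a full UTC timestamp into the task as a parameter and convert it to the desired timezone inside the notebook. The example assumes the timestamp arrives as an ISO 8601 string; the function name `local_date_from_utc` and the sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def local_date_from_utc(iso_utc: str, tz: tzinfo) -> str:
    """Convert a UTC ISO 8601 timestamp to a calendar date in the given timezone."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    parsed = datetime.fromisoformat(iso_utc.replace("Z", "+00:00"))
    # Assumption: a timestamp without an offset is treated as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(tz).date().isoformat()


# Fixed UTC-5 offset for illustration; zoneinfo.ZoneInfo("America/New_York")
# would also follow daylight-saving changes.
utc_minus_5 = timezone(timedelta(hours=-5))
print(local_date_from_utc("2024-03-15T02:30:00Z", utc_minus_5))  # 2024-03-14
```

Note that the local date can differ from the UTC date near midnight, which is why the conversion starts from a full timestamp rather than a date alone.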

by Tom_Greenwood • New Contributor III

02-01-2024 7:16:54 AM

2063 Views
9 replies
2 kudos

UDF importing from other modules

Hi community, I am using a PySpark UDF. The function is imported from a repo (in the Repos section) and registered as a UDF in the notebook. I am getting a PythonException error when the transformation is run. This is coming from the databric...

Latest Reply

DennisB
New Contributor III

03-14-2024 4:04:03 AM

2 kudos

I was getting a similar error (full traceback below), and determined that it's related to this issue. Setting the env variables DATABRICKS_HOST and DATABRICKS_TOKEN as suggested in that GitHub issue resolved the problem for me (albeit it's not a grea...

by astrobil • New Contributor II

11-15-2023 7:36:21 AM

291 Views
1 replies
0 kudos

Tab Stops Indenting in SQL Editor

I am utilizing Databricks via Azure, and I've been consistently experiencing an issue with the SQL Editor. The tab button, instead of indenting, redirects my cursor to seemingly random parts of the page. This problem has persisted since I began using...

Latest Reply

jose_gonzalez
Moderator

yesterday

0 kudos

Which DBR version are you using? Which web browser are you using?

by kartikmnc • New Contributor

12-17-2023 7:34:02 AM

422 Views
1 replies
1 kudos

Regarding Exam got Suspended at middle without any reason.

Hi Team, my Databricks Certified Data Engineer Associate exam was suspended on 17th December and is still in an in-progress state. I was in front of the camera the whole time when an alert suddenly appeared, and the support person asked me to show the desk and ...

Latest Reply

jose_gonzalez
Moderator

yesterday

1 kudos

Adding @Kaniz for visibility on this request

by tariq • New Contributor III

3 weeks ago

267 Views
1 replies
0 kudos

SqlContext in DBR 14.3

I have a Databricks workspace in GCP and I am using a cluster with Runtime 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12). I am trying to set the checkpoint directory location using the following command in a notebook: spark.sparkContext.set...

Latest Reply

jose_gonzalez
Moderator

yesterday

0 kudos

Is this error also happening on other DBR versions, or does only this version show this message?

by Hubert-Dudek • Esteemed Contributor III

Thursday

113 Views
1 replies
1 kudos

How much USD are you spending on Databricks?

Join two system tables and get exactly how much USD you are spending. The short version of the query: SELECT u.usage_date, u.sku_name, SUM(u.usage_quantity * p.pricing.default) AS total_spent, p.currency_code FROM system.billing....

Latest Reply

jose_gonzalez
Moderator

yesterday

1 kudos

Thank you for sharing this information @Hubert-Dudek

by Darian • Visitor

yesterday

49 Views
1 replies
0 kudos

Delta Live table getting error of garbage collection after running few days

Hi, I am using Delta Live Tables in continuous mode for a real-time streaming data pipeline. After the pipeline has been running for 2-3 days, I get this garbage collection error: Driver/10.15.0.73 paused the JVM process 68 seconds during the past 120 se...

Latest Reply

jose_gonzalez
Moderator

yesterday

0 kudos

Could you share the Ganglia metrics and the size/type of your driver?

by Fresher • New Contributor II

Friday

67 Views
1 replies
0 kudos

Query is taking too long to run

I have two clusters: cluster A (a Spark cluster) and cluster B (a SQL warehouse). When I run a particular query on cluster B, it works fine, but when I run the same query on cluster A, it takes a long time and never shows the output.

Latest Reply

jose_gonzalez
Moderator

yesterday

0 kudos

Check the physical query plan of the query you are running. Also, check the Spark UI to identify where the time is being spent and why.

by shanebo425 • New Contributor

Friday

105 Views
1 replies
0 kudos

Databricks OutOfMemory error on code that previously worked without issue

I have a notebook in Azure Databricks that does some transformations on a bronze tier table and inserts the transformed data into a silver tier table. This notebook is used to do an initial load of the data from our existing system into our new datal...

Latest Reply

jose_gonzalez
Moderator

yesterday

0 kudos

Please compare the Spark UI for the old job execution with the new job execution. Check whether the data volume has increased, as that could be the reason for the OOM.

by Ruby8376 • Valued Contributor

yesterday

44 Views
0 replies
0 kudos

Databricks sql warehouse has Serverless compute as a public preview.

There is a risk from infosec, as it is processed in the control plane shared with other Azure clients. Is there any control to mitigate the risk?

by LeoGaller • Visitor

yesterday

55 Views
0 replies
0 kudos

What are the options for "spark_conf.spark.databricks.cluster.profile"?

Hey guys, I'm trying to find out which options we can pass to spark_conf.spark.databricks.cluster.profile. From looking around, I know that some of the available configs are singleNode and serverless, but are there others? Where is this documented?...

by Sikki • New Contributor

Friday

181 Views
6 replies
0 kudos

Databricks Asset Bundle Workflow Redeployment Issue

Hello All, in my Databricks workflows, I have three tasks configured, with the final task set to run only if the condition "ALL_DONE" is met. During the first deployment, I observed that the dependency "ALL_DONE" was correctly assigned to the last tas...

Latest Reply

Yeshwanth
Valued Contributor

Sunday

0 kudos

Hi @Sikki Good day! There was an issue and it was fixed recently. Could you please confirm if you are still facing the issue? Best regards,

by jainshasha • Visitor

yesterday

32 Views
0 replies
0 kudos

Job Cluster in Databricks workflow

Hi, I have configured 20 different workflows in Databricks. Each is configured with a job cluster with a different name. All 20 workflows are scheduled to run at the same time. But even with a different job cluster configured in each of them, they run sequentially w...

by Phani1 • Valued Contributor

yesterday

37 Views
0 replies
0 kudos

Execute Pyspark cells concurrently

Hi Team, is it feasible to run PySpark cells concurrently in Databricks notebooks? If so, kindly provide instructions on how to accomplish this. We aim to execute the intermediate steps simultaneously. The given scenario entails the simultaneou...
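
A minimal sketch of one common approach, assuming the intermediate steps are independent of each other: rather than running separate cells in parallel, submit each step to a thread pool from a single cell. The helper `run_concurrently` and the step functions below are hypothetical placeholders; in a notebook, each step would trigger its own Spark action.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(steps, max_workers=4):
    """Run independent zero-argument callables in parallel threads.

    Returns a dict mapping each step's name to its result. An exception
    raised inside a step is re-raised here when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fn) for name, fn in steps.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical placeholder steps standing in for Spark work.
def load_orders():
    return "orders loaded"


def load_customers():
    return "customers loaded"


results = run_concurrently({"orders": load_orders, "customers": load_customers})
print(results)
```

Threads are a reasonable fit here because the driver mostly waits on the cluster while each step runs; whether the steps actually overlap depends on the resources the cluster has available.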

by PrashantAghara • Visitor

yesterday

56 Views
1 replies
0 kudos

org.apache.spark.SparkException: Job aborted due to stage failure when writing to Cosmos

I am writing data to Cosmos DB using Python & Spark on Databricks. I am getting the error below: org.apache.spark.SparkException: Job aborted due to stage failure: Authorized committer (attemptNumber=0, stage=192, partition=105) failed; but task commit suc...

Latest Reply

PrashantAghara
Visitor

yesterday

0 kudos

The cluster configs are:
Worker type & driver type: Standard_D16ads_v5
RUs for Cosmos: 1.5L
