Data Engineering

Forum Posts

Tom_Greenwood
by New Contributor III
  • 2063 Views
  • 9 replies
  • 2 kudos

UDF importing from other modules

Hi community, I am using a PySpark UDF. The function is imported from a repo (in the Repos section) and registered as a UDF in the notebook. I am getting a PythonException error when the transformation is run. This is coming from the databric...

Latest Reply
DennisB
New Contributor III
  • 2 kudos

I was getting a similar error (full traceback below), and determined that it's related to this issue. Setting the env variables DATABRICKS_HOST and DATABRICKS_TOKEN as suggested in that GitHub issue resolved the problem for me (albeit it's not a grea...

8 More Replies
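
For readers following the reply above, here is a minimal sketch of setting the two environment variables it names from Python. The helper name `set_databricks_env` and the placeholder values are hypothetical, and the reply does not say where the variables were set. Whether process-level variables reach the Python workers that run the UDF is not confirmed by the thread, so they may need to go in the cluster configuration instead.

```python
import os


def set_databricks_env(host: str, token: str) -> None:
    """Set the two variables named in the reply for the current process."""
    os.environ["DATABRICKS_HOST"] = host
    os.environ["DATABRICKS_TOKEN"] = token


# Placeholder values (hypothetical); substitute a real workspace URL and token,
# ideally read from a secret store rather than hard-coded in a notebook.
set_databricks_env("https://<workspace-url>", "<personal-access-token>")
```
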
astrobil
by New Contributor II
  • 291 Views
  • 1 replies
  • 0 kudos

Tab Stops Indenting in SQL Editor

I am utilizing Databricks via Azure, and I've been consistently experiencing an issue with the SQL Editor. The tab button, instead of indenting, redirects my cursor to seemingly random parts of the page. This problem has persisted since I began using...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Which DBR version are you using? Which web browser are you using?

kartikmnc
by New Contributor
  • 422 Views
  • 1 replies
  • 1 kudos

Regarding Exam got Suspended at middle without any reason.

Hi Team, my Databricks Certified Data Engineer Associate exam was suspended on 17th December and is still in an in-progress state. I was continuously in front of the camera when an alert suddenly appeared, and the support person asked me to show the desk and ...

Latest Reply
jose_gonzalez
Moderator
  • 1 kudos

Adding @Kaniz for visibility on this request

tariq
by New Contributor III
  • 267 Views
  • 1 replies
  • 0 kudos

SqlContext in DBR 14.3

I have a Databricks workspace in GCP and I am using the cluster with the Runtime 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12). I am trying to set the checkpoint directory location using the following command in a notebook: spark.sparkContext.set...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Is this error also happening on other DBR versions, or does only this version show this message?

Hubert-Dudek
by Esteemed Contributor III
  • 113 Views
  • 1 replies
  • 1 kudos

How much USD are you spending on Databricks?

Join two system tables and get exactly how much USD you are spending. The short version of the query: SELECT u.usage_date, u.sku_name, SUM(u.usage_quantity * p.pricing.default) AS total_spent, p.currency_code FROM system.billing....

Latest Reply
jose_gonzalez
Moderator
  • 1 kudos

Thank you for sharing this information @Hubert-Dudek 

Darian
by Visitor
  • 49 Views
  • 1 replies
  • 0 kudos

Delta Live table getting error of garbage collection after running few days

Hi, I am using a Delta Live Table in continuous mode for a real-time streaming data pipeline. After running the pipeline for 2-3 days, I am getting this garbage collection error: Driver/10.15.0.73 paused the JVM process 68 seconds during the past 120 se...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Could you share the Ganglia metrics and the size/type of your driver?

Fresher
by New Contributor II
  • 67 Views
  • 1 replies
  • 0 kudos

Query is taking too long to run

I have two clusters: cluster A (a Spark cluster) and cluster B (a SQL warehouse). When I run a particular query on cluster B, it works fine, but when I run the same query on cluster A, it takes a long time and never shows the output.

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Check the physical query plan of the query you are running. Also, check the Spark UI to identify where the time is being spent and why.

shanebo425
by New Contributor
  • 105 Views
  • 1 replies
  • 0 kudos

Databricks OutOfMemory error on code that previously worked without issue

I have a notebook in Azure Databricks that does some transformations on a bronze tier table and inserts the transformed data into a silver tier table. This notebook is used to do an initial load of the data from our existing system into our new datal...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Please compare the Spark UI from the old job execution with the one from the new job execution. Check whether the data volume has increased, as that could be the reason for the OOM.

Sikki
by New Contributor
  • 181 Views
  • 6 replies
  • 0 kudos

Databricks Asset Bundle Workflow Redeployment Issue

Hello All, in my Databricks workflows, I have three tasks configured, with the final task set to run only if the condition "ALL_DONE" is met. During the first deployment, I observed that the dependency "ALL_DONE" was correctly assigned to the last tas...

Latest Reply
Yeshwanth
Valued Contributor
  • 0 kudos

Hi @Sikki Good day! There was an issue and it was fixed recently. Could you please confirm if you are still facing the issue? Best regards,

5 More Replies
jainshasha
by Visitor
  • 32 Views
  • 0 replies
  • 0 kudos

Job Cluster in Databricks workflow

Hi, I have configured 20 different workflows in Databricks. All of them are configured with a job cluster, each with a different name. All 20 workflows are scheduled to run at the same time. But even with a different job cluster configured in each of them, they run sequentially w...

Phani1
by Valued Contributor
  • 37 Views
  • 0 replies
  • 0 kudos

Execute Pyspark cells concurrently

Hi Team, is it feasible to run PySpark cells concurrently in Databricks notebooks? If so, kindly provide instructions on how to accomplish this. We aim to execute the intermediate steps simultaneously. The given scenario entails the simultaneou...

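
A minimal sketch of one way independent steps could be launched concurrently from a single cell, using only Python's standard `concurrent.futures`. The helper `run_concurrently` and the stand-in functions `step_a` and `step_b` are hypothetical. Spark accepts jobs submitted from multiple threads within one application, so each callable could wrap a DataFrame action, but that mapping to the poster's scenario is an assumption, since the question is truncated.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(steps):
    """Run independent zero-argument callables in threads; return results by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(steps))) as pool:
        # Submit every step first so they can overlap, then collect the results.
        futures = {name: pool.submit(fn) for name, fn in steps.items()}
        # result() re-raises any exception thrown inside the corresponding step.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for the intermediate steps; in a notebook these
# could instead trigger Spark actions such as writes or counts.
def step_a():
    return "a done"


def step_b():
    return "b done"


results = run_concurrently({"step_a": step_a, "step_b": step_b})
```

This only helps when the steps do not depend on each other's output; dependent steps still have to run in order.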
PrashantAghara
by Visitor
  • 56 Views
  • 1 replies
  • 0 kudos

org.apache.spark.SparkException: Job aborted due to stage failure when writing to Cosmos

I am writing data to Cosmos DB using Python and Spark on Databricks. I am getting the error below: org.apache.spark.SparkException: Job aborted due to stage failure: Authorized committer (attemptNumber=0, stage=192, partition=105) failed; but task commit suc...

Latest Reply
PrashantAghara
  • 0 kudos

The cluster configs are: Worker type and driver type: Standard_D16ads_v5. RUs for Cosmos: 1.5L
