cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

DJey
by New Contributor III
  • 13588 Views
  • 6 replies
  • 2 kudos

Resolved! MergeSchema Not Working

Hi All, I have a scenario where my Exisiting Delta Table looks like below:Now I have an incremental data with an additional column i.e. owner:Dataframe Name --> scdDFBelow is the code snippet to merge Incremental Dataframe to targetTable, but the new...

image image image image
  • 13588 Views
  • 6 replies
  • 2 kudos
Latest Reply
Amin112
New Contributor II
  • 2 kudos

In Databricks Runtime 15.2 and above, you can specify schema evolution in a merge statement using SQL or Delta table APIs:MERGE WITH SCHEMA EVOLUTION INTO targetUSING sourceON source.key = target.keyWHEN MATCHED THENUPDATE SET *WHEN NOT MATCHED THENI...

  • 2 kudos
5 More Replies
Anonymous
by Not applicable
  • 3197 Views
  • 1 replies
  • 2 kudos

6.4 Extended Support (includes Apache Spark 2.4.5, Scala 2.11 Connect Timeout

"Notebook detached Exception when creating execution context: java.net.SocketTimeout Exception: Connect Timeout" when trying to connect my cluster to a notebook. Then "Error trying to handle that request We failed to handle that request, please try a...

  • 3197 Views
  • 1 replies
  • 2 kudos
Latest Reply
Wolverine
New Contributor III
  • 2 kudos

Hello @Retired_mod  I am facing same issue I tried changing DBR but it is still giving me error and the cluster is not startingRegardsMS

  • 2 kudos
FarBo
by New Contributor III
  • 6456 Views
  • 4 replies
  • 5 kudos

Spark issue handling data from json when the schema DataType mismatch occurs

Hi,I have encountered a problem using spark, when creating a dataframe from a raw json source.I have defined an schema for my data and the problem is that when there is a mismatch between one of the column values and its defined schema, spark not onl...

  • 6456 Views
  • 4 replies
  • 5 kudos
Latest Reply
Anonymous
Not applicable
  • 5 kudos

@Farzad Bonabi​ :Thank you for reporting this issue. It seems to be a known bug in Spark when dealing with malformed decimal values. When a decimal value in the input JSON data is not parseable by Spark, it sets not only that column to null but also ...

  • 5 kudos
3 More Replies
brickster_2018
by Databricks Employee
  • 12079 Views
  • 3 replies
  • 6 kudos

Resolved! How to add I custom logging in Databricks

I want to add custom logs that redirect in the Spark driver logs. Can I use the existing logger classes to have my application logs or progress message in the Spark driver logs.

  • 12079 Views
  • 3 replies
  • 6 kudos
Latest Reply
Kaizen
Valued Contributor
  • 6 kudos

1) Is it possible to save all the custom logging to its own file? Currently it is being logging with all other cluster logs (see image) 2) Also Databricks it seems like a lot of blank files are also being created for this. Is this a bug? this include...

  • 6 kudos
2 More Replies
Smitha1
by Valued Contributor II
  • 3962 Views
  • 9 replies
  • 3 kudos

Databricks Certified Associate Developer for Apache Spark 3.0

Databricks Certified Associate Developer for Apache Spark 3.0

  • 3962 Views
  • 9 replies
  • 3 kudos
Latest Reply
Shivam_Patil
New Contributor II
  • 3 kudos

Hey I am looking for sample papers for the above exam other than the one provided by databricks do any one have any idea about it

  • 3 kudos
8 More Replies
brickster_2018
by Databricks Employee
  • 2776 Views
  • 2 replies
  • 0 kudos

Resolved! The driver is temporarily unavailable

My job fails with Driver is temporarily unavailable. Apparently, it's permanently unavailable, because the job is not pausing but failing.

  • 2776 Views
  • 2 replies
  • 0 kudos
Latest Reply
Chalki
New Contributor III
  • 0 kudos

I am facing the same issues .  I am writing in batches using a simple for loop. I don't have any collect statements inside the loop. I am rewriting the partitions with partition overwrite dynamic mode in a huge wide delta table - several tb. The incr...

  • 0 kudos
1 More Replies
Smitha1
by Valued Contributor II
  • 6677 Views
  • 10 replies
  • 9 kudos

Resolved! Request for reattempt voucher. Databricks Certified Associate Developer for Apache Spark 3.0 exam

Hi,I gave Databricks Certified Associate Developer for Apache Spark 3.0 exam today but missed by one percent. I got 68.33% and pass is 70%.I am planning to reattempt the exam, could you kindly give me another opportunity and provide reattempt voucher...

  • 6677 Views
  • 10 replies
  • 9 kudos
Latest Reply
shriya
New Contributor II
  • 9 kudos

Hi,I gave Databricks Certified Associate Developer for Apache Spark 3.0 Python exam yesterday but missed by three percent. I got 66.66% and pass is 70%.I am planning to reattempt the exam, could you kindly give me another opportunity and provide reat...

  • 9 kudos
9 More Replies
Sujitha
by Databricks Employee
  • 2033 Views
  • 3 replies
  • 2 kudos

KB Feedback Discussion In addition to the Databricks Community, we have a Support team that maintains a Knowledge Base (KB). The KB contains answers t...

KB Feedback DiscussionIn addition to the Databricks Community, we have a Support team that maintains a Knowledge Base (KB). The KB contains answers to common questions about Databricks, as well as information on optimisation and troubleshooting.These...

  • 2033 Views
  • 3 replies
  • 2 kudos
Latest Reply
martinez
New Contributor III
  • 2 kudos

Thanks for sharing!  

  • 2 kudos
2 More Replies
Vsleg
by Contributor
  • 4292 Views
  • 5 replies
  • 3 kudos

Resolved! Issue with Apache Sparkâ„¢ Programming with Databricks course

Hello,I found an issue with the Apache Sparkâ„¢ Programming with Databricks courses on Databricks Academy when trying to do the labs. The mount that the courses use for training data is failing with what looks to me like an authentication issue (see sc...

image
  • 4292 Views
  • 5 replies
  • 3 kudos
Latest Reply
Vsleg
Contributor
  • 3 kudos

I found the course Git Repo at (https://github.com/databricks-academy/apache-spark-programming-with-databricks-english), this works so using that instead of the 'apache-spark-programming-with-databricks.dbc' file available in the learning portal. #DA...

  • 3 kudos
4 More Replies
sevvalmehder
by New Contributor II
  • 2309 Views
  • 3 replies
  • 3 kudos

Databricks run-time 12.2 LTS drop function problem

I am getting an error about the `drop function of pyspark` at a cluster using 12.2 LTS. When I check the error I see spark solved that bug, see SPARK-42444. Also when I check maintenance updates page, I saw this solved issue included the Databricks R...

image.png
  • 2309 Views
  • 3 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Sevval Mehder​ Elevate our community by acknowledging exceptional contributions. Your participation in marking the best answers is a testament to our collective pursuit of knowledge.

  • 3 kudos
2 More Replies
PriyaV
by New Contributor II
  • 12710 Views
  • 5 replies
  • 10 kudos

Suppress output in python notebooks

My dilemma is this - We use PySpark to connect to external data sources via jdbc from within databricks. Every time we issue a spark command, it spits out the connection options including the username, url and password which is not advisable. So, is ...

  • 12710 Views
  • 5 replies
  • 10 kudos
Latest Reply
Pabeggetur
New Contributor II
  • 10 kudos

Thanks for taking the time to discuss this, I feel strongly about it and love learning more on this topic.youi contact hoursuber eats complaints

  • 10 kudos
4 More Replies
MohamedThanveer
by New Contributor II
  • 1078 Views
  • 1 replies
  • 0 kudos

Databricks Certified Associate Developer for Apache Spark 3.0 - Python Cancellation

I have scheduled an examination on 1st June 2023 and due to personal reason, I have cancelled the examination on 26th May 2023 (more than 72 hours) but I am yet to receive the refund amount. In the auto generated mail it is mentioned that the refund ...

image
  • 1078 Views
  • 1 replies
  • 0 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

adding @Suteja Kanuri​  and @Vidula Khanna​ for visibility

  • 0 kudos
Nis
by New Contributor II
  • 1580 Views
  • 1 replies
  • 2 kudos

Best sequence of using Vacuum, optimize, fsck repair and refresh commands.

I have a delta table whose size will increases gradually now we have around 1.5 crores of rows while running vacuum command on that table i am getting the below error.ERROR: Job aborted due to stage failure: Task 7 in stage 491.0 failed 4 times, most...

  • 1580 Views
  • 1 replies
  • 2 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 2 kudos

Do you have access to the Executor 7 logs? is there a high GC or some other events that is making the heartbeat timeout? would you be able to check the failed stages?

  • 2 kudos
Jujiro
by New Contributor III
  • 8994 Views
  • 11 replies
  • 7 kudos

Random error: At least one column must be specified for the table?

I have the following code in a notebook. It is randomly giving me the error, "At least one column must be specified for the table." The error occurs (if at all it occurs) only on the first run after attaching to a cluster.Cluster details:Summary5-1...

dbr-bug
  • 8994 Views
  • 11 replies
  • 7 kudos
Latest Reply
Harold
New Contributor II
  • 7 kudos

Please check if this could help or not:spark.databricks.delta.catalog.update.enabled false

  • 7 kudos
10 More Replies
Labels