Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

RS1
by New Contributor III
  • 709 Views
  • 1 replies
  • 1 kudos

I attended the Advanced Machine Learning with Databricks training virtually last week. I am still unable to get the day 2 session videos of any of the...

I attended the Advanced Machine Learning with Databricks training virtually last week. I am still unable to get the day 2 session videos of any of the instructor-led paid trainings. They are supposed to be available for replay within 24 hours but I ...

Latest Reply
murali9
New Contributor II
  • 1 kudos

I have the same problem.

  • 1 kudos
FarBo
by New Contributor III
  • 8248 Views
  • 5 replies
  • 5 kudos

Spark issue handling data from json when the schema DataType mismatch occurs

Hi, I have encountered a problem using Spark when creating a DataFrame from a raw JSON source. I have defined a schema for my data, and the problem is that when there is a mismatch between one of the column values and its defined schema, Spark not onl...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

@Farzad Bonabi: Thank you for reporting this issue. It seems to be a known bug in Spark when dealing with malformed decimal values. When a decimal value in the input JSON data is not parseable by Spark, it sets not only that column to null but also ...
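One way to keep malformed decimals from reaching the reader at all is to screen the raw JSON lines first. The following is only a minimal sketch in plain Python, not Spark's own parsing logic: the function names, the `amount` field, and the sample records are all hypothetical, and Python's `Decimal` may accept or reject values that Spark treats differently.

```python
import json
from decimal import Decimal, InvalidOperation


def is_parseable_decimal(value) -> bool:
    """Return True if the value converts to a finite Decimal."""
    try:
        return Decimal(str(value)).is_finite()
    except InvalidOperation:
        return False


def split_by_decimal_validity(lines, decimal_fields):
    """Split JSON lines into records whose decimal fields parse and those that do not."""
    good, bad = [], []
    for line in lines:
        record = json.loads(line)
        # Missing or null fields are left for the schema to handle.
        values = [record[f] for f in decimal_fields if record.get(f) is not None]
        if all(is_parseable_decimal(v) for v in values):
            good.append(record)
        else:
            bad.append(record)
    return good, bad


# Hypothetical sample input: the second record uses a comma as decimal separator.
raw_lines = [
    '{"id": 1, "amount": "12.50"}',
    '{"id": 2, "amount": "12,50"}',
]
good, bad = split_by_decimal_validity(raw_lines, ["amount"])
print(len(good), "clean record(s),", len(bad), "rejected record(s)")
```

Records in `bad` could then be logged or quarantined separately, so that a single unparseable value does not affect the rest of the row when the clean records are loaded.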

  • 5 kudos
4 More Replies
brickster_2018
by Databricks Employee
  • 5582 Views
  • 4 replies
  • 2 kudos

Resolved! Databricks Spark Vs Spark on Yarn

I am moving my Spark workloads from an EMR/on-premises Spark cluster to Databricks. I understand Databricks Spark is different from YARN. How is the Databricks architecture different from YARN?

Latest Reply
de-qrosh
New Contributor III
  • 2 kudos

What about the disadvantages? How can I cleanly separate multiple jobs running on the same cluster in the logs, and likewise in the Spark UI?

  • 2 kudos
3 More Replies
DJey
by New Contributor III
  • 15408 Views
  • 6 replies
  • 2 kudos

Resolved! MergeSchema Not Working

Hi All, I have a scenario where my existing Delta table looks like below: Now I have incremental data with an additional column, i.e. owner: DataFrame name --> scdDF. Below is the code snippet to merge the incremental DataFrame into targetTable, but the new...

Latest Reply
Amin112
New Contributor II
  • 2 kudos

In Databricks Runtime 15.2 and above, you can specify schema evolution in a merge statement using SQL or Delta table APIs:
MERGE WITH SCHEMA EVOLUTION INTO target
USING source
ON source.key = target.key
WHEN MATCHED THEN
UPDATE SET *
WHEN NOT MATCHED THEN
I...
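As a rough sketch, the statement quoted in this reply could be assembled from Python and passed to `spark.sql`. The helper name `build_merge_sql` is hypothetical, the table and key names are the placeholders from the reply, and the final `INSERT *` clause is an assumption about the part of the reply that is cut off.

```python
def build_merge_sql(target: str, source: str, key: str) -> str:
    """Return a MERGE statement that opts in to schema evolution.

    Identifiers are interpolated as-is, so only pass trusted names.
    """
    return (
        f"MERGE WITH SCHEMA EVOLUTION INTO {target}\n"
        f"USING {source}\n"
        f"ON {source}.{key} = {target}.{key}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        # Assumed ending: the reply is truncated after "WHEN NOT MATCHED THEN".
        "WHEN NOT MATCHED THEN INSERT *"
    )


merge_sql = build_merge_sql("target", "source", "key")
print(merge_sql)

# In a Databricks notebook, where `spark` is predefined, this would be run as:
# spark.sql(merge_sql)
```

The `spark.sql` call is left commented out so the snippet can be checked outside a cluster. On runtimes older than the one named in the reply, this syntax should not be expected to work.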

  • 2 kudos
5 More Replies
Anonymous
by Not applicable
  • 3429 Views
  • 1 replies
  • 2 kudos

6.4 Extended Support (includes Apache Spark 2.4.5, Scala 2.11) Connect Timeout

"Notebook detached. Exception when creating execution context: java.net.SocketTimeoutException: Connect Timeout" when trying to connect my cluster to a notebook. Then "Error trying to handle that request. We failed to handle that request, please try a...

Latest Reply
Wolverine
New Contributor III
  • 2 kudos

Hello @Retired_mod, I am facing the same issue. I tried changing the DBR version, but it still gives me an error and the cluster does not start. Regards, MS

  • 2 kudos
brickster_2018
by Databricks Employee
  • 13326 Views
  • 3 replies
  • 6 kudos

Resolved! How to add custom logging in Databricks

I want to add custom logs that are written to the Spark driver logs. Can I use the existing logger classes to get my application logs or progress messages into the Spark driver logs?
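For a Python notebook or job, one possible approach is the standard `logging` module writing to standard error. This is a minimal sketch that assumes the driver's standard error stream is captured in the driver logs; the logger name `my_app` and the helper `get_app_logger` are hypothetical.

```python
import logging
import sys


def get_app_logger(name: str = "my_app") -> logging.Logger:
    """Return a logger that writes formatted messages to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Attach the handler only once, so re-running a cell does not duplicate lines.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    # Keep messages out of the root logger to avoid double printing.
    logger.propagate = False
    return logger


logger = get_app_logger()
logger.info("Batch %d of %d finished", 1, 10)
```

Swapping the `StreamHandler` for a `logging.FileHandler` would send the same messages to a dedicated file instead, if a separate log file is preferred.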

Latest Reply
Kaizen
Valued Contributor
  • 6 kudos

1) Is it possible to save all the custom logging to its own file? Currently it is being logged with all the other cluster logs (see image). 2) Also, it seems like a lot of blank files are being created for this in Databricks. Is this a bug? This include...

  • 6 kudos
2 More Replies
Smitha1
by Valued Contributor II
  • 4426 Views
  • 9 replies
  • 3 kudos

Databricks Certified Associate Developer for Apache Spark 3.0

Latest Reply
Shivam_Patil
New Contributor II
  • 3 kudos

Hey, I am looking for sample papers for the above exam other than the one provided by Databricks. Does anyone have any idea about it?

  • 3 kudos
8 More Replies
brickster_2018
by Databricks Employee
  • 3169 Views
  • 2 replies
  • 0 kudos

Resolved! The driver is temporarily unavailable

My job fails with Driver is temporarily unavailable. Apparently, it's permanently unavailable, because the job is not pausing but failing.

Latest Reply
Chalki
New Contributor III
  • 0 kudos

I am facing the same issue. I am writing in batches using a simple for loop. I don't have any collect statements inside the loop. I am rewriting the partitions with dynamic partition overwrite mode in a huge, wide Delta table of several TB. The incr...

  • 0 kudos
1 More Replies
Smitha1
by Valued Contributor II
  • 7523 Views
  • 10 replies
  • 9 kudos

Resolved! Request for reattempt voucher. Databricks Certified Associate Developer for Apache Spark 3.0 exam

Hi, I took the Databricks Certified Associate Developer for Apache Spark 3.0 exam today but missed by one percent. I got 68.33% and the pass mark is 70%. I am planning to reattempt the exam. Could you kindly give me another opportunity and provide a reattempt voucher...

Latest Reply
shriya
New Contributor II
  • 9 kudos

Hi, I took the Databricks Certified Associate Developer for Apache Spark 3.0 Python exam yesterday but missed by three percent. I got 66.66% and the pass mark is 70%. I am planning to reattempt the exam. Could you kindly give me another opportunity and provide a reat...

  • 9 kudos
9 More Replies
Sujitha
by Databricks Employee
  • 2253 Views
  • 3 replies
  • 2 kudos

KB Feedback Discussion In addition to the Databricks Community, we have a Support team that maintains a Knowledge Base (KB). The KB contains answers t...

KB Feedback Discussion: In addition to the Databricks Community, we have a Support team that maintains a Knowledge Base (KB). The KB contains answers to common questions about Databricks, as well as information on optimisation and troubleshooting. These...

Latest Reply
martinez
New Contributor III
  • 2 kudos

Thanks for sharing!  

  • 2 kudos
2 More Replies
Vsleg
by Contributor
  • 4944 Views
  • 5 replies
  • 3 kudos

Resolved! Issue with Apache Spark™ Programming with Databricks course

Hello, I found an issue with the Apache Spark™ Programming with Databricks courses on Databricks Academy when trying to do the labs. The mount that the courses use for training data is failing with what looks to me like an authentication issue (see sc...

Latest Reply
Vsleg
Contributor
  • 3 kudos

I found the course Git repo at https://github.com/databricks-academy/apache-spark-programming-with-databricks-english and it works, so I am using that instead of the 'apache-spark-programming-with-databricks.dbc' file available in the learning portal. #DA...

  • 3 kudos
4 More Replies
sevvalmehder
by New Contributor II
  • 2638 Views
  • 3 replies
  • 3 kudos

Databricks run-time 12.2 LTS drop function problem

I am getting an error from the PySpark `drop` function on a cluster using 12.2 LTS. When I checked the error, I saw that Spark has fixed this bug, see SPARK-42444. Also, when I checked the maintenance updates page, I saw that this fix is included in the Databricks R...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Sevval Mehder, Elevate our community by acknowledging exceptional contributions. Your participation in marking the best answers is a testament to our collective pursuit of knowledge.

  • 3 kudos
2 More Replies
PriyaV
by New Contributor II
  • 15178 Views
  • 5 replies
  • 10 kudos

Suppress output in python notebooks

My dilemma is this: we use PySpark to connect to external data sources via JDBC from within Databricks. Every time we issue a Spark command, it prints the connection options, including the username, URL and password, which is not advisable. So, is ...
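For output that is written through Python's own `sys.stdout` or `sys.stderr`, one generic option is to redirect both streams while the call runs. The sketch below shows that technique only. Whether it hides the connection options described in this post is an assumption, since text emitted by the JVM or rendered by the notebook as a cell result does not pass through these Python streams. `run_quietly` and `noisy_connect` are hypothetical names.

```python
import contextlib
import io


def run_quietly(func, *args, **kwargs):
    """Call func while capturing anything it writes to stdout or stderr."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer), contextlib.redirect_stderr(buffer):
        result = func(*args, **kwargs)
    # The captured text is returned rather than displayed, so the caller decides
    # whether to discard it or inspect it.
    return result, buffer.getvalue()


def noisy_connect():
    # Hypothetical stand-in for a call that echoes its connection options.
    print("url=jdbc:example user=demo password=secret")
    return "connected"


result, captured = run_quietly(noisy_connect)
print(result)
```

Only `result` is printed here. The captured text stays in memory and can simply be dropped.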

Latest Reply
Pabeggetur
New Contributor II
  • 10 kudos

Thanks for taking the time to discuss this, I feel strongly about it and love learning more on this topic.

  • 10 kudos
4 More Replies
MohamedThanveer
by New Contributor II
  • 1257 Views
  • 1 replies
  • 0 kudos

Databricks Certified Associate Developer for Apache Spark 3.0 - Python Cancellation

I scheduled an examination for 1st June 2023 and, for personal reasons, cancelled it on 26th May 2023 (more than 72 hours in advance), but I have yet to receive the refund. In the auto-generated mail it is mentioned that the refund ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Adding @Suteja Kanuri and @Vidula Khanna for visibility.

  • 0 kudos