Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

fabio2352
by Contributor
  • 1129 Views
  • 0 replies
  • 1 kudos

evidences_pass

I haven't received my Databricks Certified Data Engineer Associate. I passed my certification exam, Databricks Certified Data Engineer Associate, on 27 October 2022. I have yet to receive a certificate or badge. Any help is much appreciated. I have a ...

elgeo
by Valued Contributor II
  • 1718 Views
  • 0 replies
  • 5 kudos

Clean up _delta_log files

Hello experts. We are trying to clarify how to clean up the large number of files accumulating in the _delta_log folder (JSON, CRC, and checkpoint files). We went through the related posts in the forum and followed the steps below: SET spark.da...

327753
by New Contributor III
  • 1966 Views
  • 4 replies
  • 6 kudos

Resolved! Using the %debug magic in a Databricks notebook

When developing locally, I can write %debug in a new cell after encountering an error and jump into the function where the error originated. In Databricks, this freezes the notebook indefinitely. For example: In [1]: def query_data(): df_full = qu...

Latest Reply
327753
New Contributor III
  • 6 kudos

I just upgraded my personal node and %debug worked! I appreciate the reminder to use pdb() itself when appropriate too. I'm still interested in whether we should have any concerns about upgrading our main cluster - please do let me know, and then I'l...

  • 6 kudos
3 More Replies
ebyhr
by New Contributor II
  • 4901 Views
  • 5 replies
  • 3 kudos

How to fix intermittent 503 errors in 10.4 LTS

Recently I sometimes get the error below on version 10.4 LTS. Is there a fix for this intermittent failure? I added retry logic to our code, but the Databricks query succeeded (even though it threw an exception), which leads to an unexpected table statu...
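
One way to keep retry logic from re-running a statement that already took effect is to check for the effect before each retry. The sketch below is a generic illustration, not Databricks-specific: `run_with_checked_retry`, `action`, and `already_applied` are hypothetical names, and how `already_applied` inspects the table (for example, by looking for a batch ID you wrote) is an assumption left to the caller.

```python
import time
from typing import Callable


def run_with_checked_retry(
    action: Callable[[], None],
    already_applied: Callable[[], bool],
    attempts: int = 3,
    delay_seconds: float = 1.0,
) -> int:
    """Run `action`, retrying on failure only if its effect is not yet visible.

    Returns the number of times `action` was actually called.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    calls = 0
    while True:
        calls += 1
        try:
            action()
            return calls
        except Exception:
            # The call raised, but the work may still have landed
            # (for example, an error returned after the query committed).
            if already_applied():
                return calls
            if calls >= attempts:
                raise
            time.sleep(delay_seconds)


# Usage with a simulated flaky write: it commits, then raises.
table_rows = []


def flaky_write() -> None:
    table_rows.append("batch-1")
    raise RuntimeError("503 Service Unavailable")


calls_made = run_with_checked_retry(
    flaky_write,
    already_applied=lambda: "batch-1" in table_rows,
    delay_seconds=0.0,
)
```

The design choice here is that an exception alone is not treated as proof of failure; the retry happens only when the check confirms the write is missing.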

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Yuya Ebihara, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thank...

  • 3 kudos
4 More Replies
alejandrofm
by Valued Contributor
  • 569 Views
  • 0 replies
  • 2 kudos

How can I know if an instance has fallen back to On-demand?

Hi, I have several clusters, some with a 45% max spot price and some more important ones with a higher value. I want to know the best way to configure this but cannot find anything (a value of how many nodes of the last run were On-demand will do the t...

joshi
by New Contributor II
  • 883 Views
  • 2 replies
  • 2 kudos

Full-screen mode is not working for the Spark course. Has anyone else run into this?

Full-screen mode is not working for the Spark course. Has anyone else tried it and faced the same issue?

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Abhishek Joshi, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

  • 2 kudos
1 More Replies
djfliu
by New Contributor III
  • 1313 Views
  • 3 replies
  • 4 kudos

Help optimizing large empty gaps where no executors are running jobs in Spark UI. Structured streaming writing.

Hi, I'm running a Structured Streaming job on a pipeline with a medallion architecture. In my silver layer, we read from the bronze layer using Structured Streaming and write the stream to the silver layer with a foreachBatch function doing s...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Danny Liu, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

  • 4 kudos
2 More Replies
Prakash0811
by New Contributor II
  • 494 Views
  • 0 replies
  • 2 kudos

What are the Delta Live Table migration advantages?

Currently we implement a medallion architecture with Delta tables, using notebooks and jobs. 1) What is the advantage of migrating the existing implementation to Delta Live Tables? 2) What kind of effort is involved in the migration? 3) Will the m...

StephanieRivera
by Valued Contributor II
  • 5706 Views
  • 3 replies
  • 3 kudos

Resolved! How do I fix tabs vs spaces in notebooks?

I am getting "IndentationError: unindent does not match any outer indentation level" because the code I pasted uses 4-space tabs, but tabs in Databricks are 2 spaces. How do I fix this? Do I have to copy and paste it back out?
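
One possible workaround, independent of any editor setting, is to normalize the indentation of the pasted code before running it. The sketch below assumes the pasted code uses tabs or 4-space levels and the target is 2-space levels; `reindent` is a hypothetical helper, not a Databricks feature.

```python
def reindent(source: str, old_width: int = 4, new_width: int = 2) -> str:
    """Convert leading indentation from old_width-wide levels to new_width-wide levels.

    Leading tabs are first expanded to old_width columns. Only leading
    whitespace is touched, so tabs inside string literals are preserved.
    Continuation lines of multi-line strings are not treated specially.
    """
    result = []
    for line in source.splitlines():
        body = line.lstrip(" \t")
        lead = line[: len(line) - len(body)].expandtabs(old_width)
        levels, extra = divmod(len(lead), old_width)
        result.append(" " * (levels * new_width + extra) + body)
    return "\n".join(result)


# Pasted code that mixes tab-indented and 4-space-indented lines.
pasted = "def f(x):\n\tif x:\n\t\treturn 1\n    return 0"
fixed = reindent(pasted)
compile(fixed, "<pasted>", "exec")  # raises if indentation is still inconsistent
```

Running the pasted text through `reindent` and pasting the result back gives every line the same indentation unit, which is what the interpreter checks for.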

Latest Reply
Zainaboladokun
New Contributor III
  • 3 kudos

Nopu

  • 3 kudos
2 More Replies
ImAbhishekTomar
by New Contributor III
  • 6600 Views
  • 6 replies
  • 4 kudos

kafkashaded.org.apache.kafka.common.errors.TimeoutException: topic-downstream-data-nonprod not present in metadata after 60000 ms.

I am facing an error when trying to write data to Kafka using Spark streaming. # Extract: source_stream_df = (spark.readStream .format("cosmos.oltp.changeFeed") .option("spark.cosmos.container", PARM_CONTAINER_NAME) .option("spark.cosmos.read.inferSchema.en...

Latest Reply
Zainaboladokun
New Contributor III
  • 4 kudos

BIU$I

  • 4 kudos
5 More Replies
HAmera
by New Contributor III
  • 1922 Views
  • 4 replies
  • 11 kudos

Using ipywidgets in Azure Databricks dashboards

Is it possible to use ipywidgets in Azure Databricks dashboards?

Latest Reply
Anonymous
Not applicable
  • 11 kudos

Hi @Hossein Amirinia, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. T...

  • 11 kudos
3 More Replies
alejandrofm
by Valued Contributor
  • 1645 Views
  • 4 replies
  • 2 kudos

Resolved! Orphan (?) files on Databricks S3 bucket

Hi, I'm seeing a lot of empty (and non-empty) directories on routes like: xxxxxx.jobs/FileStore/job-actionstats/, xxxxxx.jobs/FileStore/job-result/, xxxxxx.jobs/command-results/. Can I create a lifecycle to delete old objects (files/directories)? How many days? w...

Latest Reply
alejandrofm
Valued Contributor
  • 2 kudos

Hi! I didn't know that. Purging right now. Is there a way to schedule that so logs are retained for less time? Maybe I want to keep the last 7 days of everything? Thanks!
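
A scheduled purge of this kind is commonly expressed as an S3 lifecycle configuration. The sketch below builds one that expires objects after 7 days under given prefixes. It is an illustration under assumptions: `build_expiration_rules` and `apply_rules` are hypothetical helpers, the `WORKSPACE_ID` prefixes are placeholders for the real directory names, and which prefixes are safe to expire should be confirmed before applying anything.

```python
from typing import Any, Dict, List


def build_expiration_rules(prefixes: List[str], days: int = 7) -> Dict[str, Any]:
    """Build an S3 lifecycle configuration that expires objects under each prefix."""
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}d-{index}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
            for index, prefix in enumerate(prefixes)
        ]
    }


def apply_rules(bucket: str, config: Dict[str, Any]) -> None:
    """Apply the configuration with boto3 (requires AWS credentials)."""
    import boto3  # imported here so the builder works without boto3 installed

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


# Placeholder prefixes: substitute the real workspace directory names.
config = build_expiration_rules(
    ["WORKSPACE_ID.jobs/command-results/", "WORKSPACE_ID.jobs/FileStore/job-result/"]
)
```

Note that `put_bucket_lifecycle_configuration` replaces the bucket's existing lifecycle configuration, so any rules already in place need to be included in the same call.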

  • 2 kudos
3 More Replies
rt2
by New Contributor III
  • 1036 Views
  • 2 replies
  • 3 kudos

Resolved! Fundamentals of Databricks Lakehouse badge not received

I passed the Databricks fundamentals exam and, like many others, I did not receive my badge. I am very interested in putting this badge on my LinkedIn profile; please help. My email ID is: rahul.psit.ec@gmail.com, which Databricks is resolving as: ...

Latest Reply
rt2
New Contributor III
  • 3 kudos

I got the badge now. Thanks.

  • 3 kudos
1 More Replies
r-g-s-j
by New Contributor
  • 1651 Views
  • 1 replies
  • 0 kudos

How to Configure PySpark Jobs Using PEX

Issue: I am attempting to create a PySpark job via the Databricks UI (with spark-submit) using the parameters below (dependencies are in the PEX file), but I am getting an exception that the PEX file does not exist. It's my understanding that the -...

Latest Reply
franck
New Contributor II
  • 0 kudos

Hi, I'm facing the same issue trying to execute a PySpark job with spark-submit. I have explored the same solutions as you: the --files option, spark.pyspark.driver.python, and spark.executorEnv.PEX_ROOT. Have you made any progress in resolving the problem?

  • 0 kudos
karthik_p
by Esteemed Contributor
  • 1770 Views
  • 5 replies
  • 8 kudos

ODBC connectivity issues with Databricks when we are off the VPN in GCP

Hi Team, we get the error below when we try to connect our tool over an ODBC connection without logging in to the VPN; when we are on the VPN, the issue does not occur. [Simba][ThriftExtension] (14) Unexpected response from server during a HTTP ...

Latest Reply
karthik_p
Esteemed Contributor
  • 8 kudos

@Kaniz Fatma our team is working with Databricks; we can close this thread.

  • 8 kudos
4 More Replies