cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Tacuma
by New Contributor II
  • 995 Views
  • 4 replies
  • 1 kudos

Scheduling jobs with Airflow result in each task running multiple jobs.

Hey everyone, I'm experiementing with running containerized pyspark jobs in Databricks, and orchestrating them with airflow. I am however, encountering an issue here. When I trigger an airflow DAG, and I look at the logs, I see that airflow is spinni...

  • 995 Views
  • 4 replies
  • 1 kudos
Latest Reply
Tacuma
New Contributor II
  • 1 kudos

Both, I guess? Yes, all jobs share the same config - the question I have is why in the same airflow task log, there are 3 jobs runs. I'm hoping that there's something in the configs and may give me some kind of clue.

  • 1 kudos
3 More Replies
JordanYaker
by Contributor
  • 2995 Views
  • 7 replies
  • 8 kudos

Resolved! Is anyone else experiencing intermittent "Failure starting REPL" errors with PySpark Jobs?

I have a Multi-Task Job that is running a bunch of PySpark notebooks and about 30-60% of the time, my jobs fail with the following error:I haven't seen any consistency with this error. I've had as many as all of the tasks in the job giving this error...

image.png
  • 2995 Views
  • 7 replies
  • 8 kudos
Latest Reply
James_Cole
New Contributor III
  • 8 kudos

Hi. Did you ever got a resolution to this problem outside of rolling back to 10.4? I have recently moved some workloads over to runtime 11.3 and am experiencing intermittent "repl did not start in 30 seconds." errors.I have increased the repl timeout...

  • 8 kudos
6 More Replies
Labels