Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Job running time too long

galzamo
New Contributor

Hi all,

I'm setting up my first data jobs.

I created one job that consists of 4 other jobs.

Yesterday I ran the 4 jobs separately and they worked fine (about half an hour).

Today I ran the big job, and the 4 jobs have been running for 2 hours (and are still running).

Why is that happening? I'm using the same compute.

 

Thanks!


anardinelli
Databricks Employee

Hello @galzamo, how are you?

You can check the Spark UI for long-running stages, which may give you a clue about where the job is spending the most time. A few things could be the reason:

1. An increase in the data volume or number of partitions in your source data

2. Cluster concurrency (if you're using a shared cluster with other users)

3. Network and connection issues when connecting to external data sources
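
To narrow down which stage is slow without clicking through the Spark UI, one option is to rank stages by their executor run time. The sketch below assumes you already have the stage list as parsed JSON in the shape returned by Spark's monitoring REST API (`/api/v1/applications/[app-id]/stages`), with the fields `stageId`, `name`, and `executorRunTime` in milliseconds. The helper name `slowest_stages` and the sample records are made up for illustration; they are not from the job in question.

```python
from typing import Any, Dict, List, Tuple


def slowest_stages(
    stages: List[Dict[str, Any]], top_n: int = 3
) -> List[Tuple[int, str, float]]:
    """Return (stageId, name, run time in seconds) for the top_n slowest stages.

    Assumes each record carries 'stageId', 'name' and 'executorRunTime' (ms),
    as in the Spark monitoring REST API stage listing.
    """
    ranked = sorted(
        stages, key=lambda s: s.get("executorRunTime", 0), reverse=True
    )
    return [
        (s["stageId"], s.get("name", ""), s.get("executorRunTime", 0) / 1000.0)
        for s in ranked[:top_n]
    ]


# Hypothetical sample data, only to show the expected input shape.
sample = [
    {"stageId": 0, "name": "load source", "executorRunTime": 120000},
    {"stageId": 1, "name": "filter", "executorRunTime": 45000},
    {"stageId": 2, "name": "join and write", "executorRunTime": 5400000},
]

for stage_id, name, seconds in slowest_stages(sample, top_n=2):
    print(f"stage {stage_id}: {seconds:.1f}s  {name}")
```

Comparing this ranking between yesterday's separate runs and today's combined run should show whether one stage grew (more data or partitions) or whether every stage got slower (more likely contention or network).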

If you can share more details about your job and the Spark logs, we can help you check.

Best,

Alessandro
