Data Engineering

Job running time too long

galzamo
New Contributor

Hi all,

I'm setting up my first data jobs.

I created one job that consists of 4 other jobs.

Yesterday I ran the 4 jobs separately and they worked fine (about half an hour in total).

Today I ran the big job, and the 4 jobs have been running for 2 hours (and are still running).

Why is that happening? I'm using the same compute.

 

Thanks!

1 REPLY

anardinelli
New Contributor III

Hello @galzamo, how are you?

You can check the Spark UI for long-running stages, which may give you a clue about where the job is spending most of its time. Several things could be the reason:

1. An increase in data volume and partitions in your source data

2. Cluster concurrency (if you're using a shared cluster with other users)

3. Network and connection issues when connecting to external data sources
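
As a minimal sketch of how to narrow this down, the snippet below ranks stages by total executor run time, the same information the Stages tab of the Spark UI shows. It assumes you have already exported the stage list as JSON (the field names `stageId`, `name`, `numTasks`, and `executorRunTime` follow the stage listing of Spark's monitoring REST API). The helper name `slowest_stages` and all sample values are made up for illustration.

```python
from typing import Any, Dict, List


def slowest_stages(stages: List[Dict[str, Any]], top_n: int = 5) -> List[Dict[str, Any]]:
    """Return the top_n stages ranked by total executor run time (ms), slowest first."""
    ranked = sorted(stages, key=lambda s: s.get("executorRunTime", 0), reverse=True)
    summary = []
    for stage in ranked[:top_n]:
        run_time_ms = stage.get("executorRunTime", 0)
        num_tasks = stage.get("numTasks", 0)
        summary.append(
            {
                "stageId": stage.get("stageId"),
                "name": stage.get("name"),
                "numTasks": num_tasks,
                "executorRunTime": run_time_ms,
                # Average time per task; guard against stages with zero tasks.
                "avgTaskMs": run_time_ms / max(num_tasks, 1),
            }
        )
    return summary


# Hypothetical sample data, NOT taken from a real job.
sample_stages = [
    {"stageId": 0, "name": "read source", "numTasks": 200, "executorRunTime": 120000},
    {"stageId": 1, "name": "join and write", "numTasks": 8, "executorRunTime": 5400000},
    {"stageId": 2, "name": "aggregate", "numTasks": 50, "executorRunTime": 30000},
]

for row in slowest_stages(sample_stages, top_n=2):
    print(row)
```

A stage with a high average time per task and few tasks may point to too few partitions or skewed data, while many slow tasks across all stages is more consistent with a busy cluster or a slow external source.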

If you can share more details about your job and the Spark logs, we can help you investigate.

Best,

Alessandro
