Job Concurrency Queue not working as expected

arielmoraes
New Contributor III

I have a process that should run the same notebook with varying parameters, which translates to a job with queueing and concurrency enabled. When the first executions are triggered, the job runs behave as expected: with max concurrent runs set to 10, executions start running concurrently up to that limit and new executions are queued.

[Screenshot: runs executing concurrently up to the limit, with additional runs queued]
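
For reference, here is a minimal sketch of how this kind of setup can be triggered with the Databricks Python SDK; the job ID and the parameter name are placeholders, not the actual values from my process:

```python
# Minimal sketch (placeholder job ID and parameter name): trigger the same
# notebook job many times with varying parameters. The job has queueing
# enabled and "Maximum concurrent runs" set to 10, so runs beyond the limit
# should be queued rather than skipped.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # credentials resolved from the environment / config profile

JOB_ID = 123456789  # placeholder job ID

for partition in range(30):
    # run_now returns immediately; the run itself executes or queues on the Jobs service
    w.jobs.run_now(
        job_id=JOB_ID,
        notebook_params={"partition": str(partition)},  # placeholder parameter
    )
```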

After the current concurrent executions end, the queued items are expected to start processing as slots free up. What actually happens is that only a few queued items execute concurrently after the first ones finish:

[Screenshot: only a few queued runs executing after the first batch finishes]

Most of the time only a couple of jobs run concurrently after the first batch. 

Is this a bug in the queueing mechanism, or am I missing something?

 

 

3 REPLIES

Kaniz
Community Manager

Hi @arielmoraes, it's difficult to say definitively whether there's a bug in the queueing mechanism.

However, there are a few things you could check:

1. **Cluster resources**: Ensure that your cluster has enough resources to run the jobs concurrently. If resources are insufficient, jobs may be queued and not run concurrently as expected.

2. **Job configuration**: Check the configuration of your jobs. Make sure that the "Maximum concurrent runs" is set to the desired number.

For Structured Streaming jobs it is recommended to set "Maximum concurrent runs" to 1, but in your case you might need a higher value (like the 10 you mentioned). One way to inspect these settings is sketched after this list.

3. **Job retries**: If a job fails, it might be retried based on your job configuration, which could affect the number of concurrent running jobs. 
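
A minimal sketch of how these settings can be checked with the Databricks Python SDK (the job ID is a placeholder; the attribute names follow the Jobs 2.1 API fields):

```python
# Minimal sketch (placeholder job ID): inspect the job settings that affect
# queued concurrent runs.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
JOB_ID = 123456789  # placeholder job ID

settings = w.jobs.get(job_id=JOB_ID).settings

print("max_concurrent_runs:", settings.max_concurrent_runs)
print("queue enabled:", settings.queue.enabled if settings.queue else False)
if settings.tasks:
    # Retries are configured per task; a retried task keeps its run slot occupied longer
    print("retries (first task):", settings.tasks[0].max_retries)
```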

If you've checked these aspects and the problem persists, it might be worth reaching out to Databricks support by filing a support ticket for further assistance.

Kaniz
Community Manager

Hi @arielmoraes, thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance!

arielmoraes
New Contributor III

Hi @Kaniz, we double-checked everything; the resources are sufficient and all settings are properly set. I'll reach out to support by filing a new ticket. Thank you for your help.
