Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Spark is not executing any tasks

Diogo_W
New Contributor III

I have an issue where Spark is not submitting any tasks, on any workspace or cluster, even a SQL Warehouse.

Even for very simple code it hangs forever.

[Screenshots: Diogo_W_0-1698352974280.png, Diogo_W_1-1698353051402.png]

Has anyone ever faced something similar? Our infra is on AWS.

 

Accepted Solutions

Diogo_W
New Contributor III

Found the solution:

 

It turned out to be an issue with the security groups: traffic between members of the cluster's own security group was not open on all ports for TCP and UDP. After fixing that, the jobs ran fine. It seems we also needed more workers.
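
For anyone wanting to check the same thing, here is a minimal sketch of how the rule check could look. It assumes the inbound rules are in the shape that boto3's `describe_security_groups` returns under `IpPermissions`; the function name `missing_self_rules` and the group ID `sg-example` are made up for illustration and are not from our actual setup.

```python
def missing_self_rules(group_id, ip_permissions):
    """Return the protocols (tcp/udp) that are NOT open on all ports
    for traffic coming from the security group itself.

    Simplification: a protocol only counts as open if a single rule
    covers ports 0-65535; rules split across several ranges are ignored.
    """
    missing = {"tcp", "udp"}
    for perm in ip_permissions:
        # Only rules whose source is the same group count as "internal".
        sources = {pair.get("GroupId") for pair in perm.get("UserIdGroupPairs", [])}
        if group_id not in sources:
            continue
        proto = perm.get("IpProtocol")
        if proto == "-1":
            # "-1" means all protocols on all ports.
            return set()
        full_range = perm.get("FromPort") == 0 and perm.get("ToPort") == 65535
        if proto in missing and full_range:
            missing.discard(proto)
    return missing


# Hypothetical rules: TCP fully open inside the group, UDP only on port 53.
example_rules = [
    {"IpProtocol": "tcp", "FromPort": 0, "ToPort": 65535,
     "UserIdGroupPairs": [{"GroupId": "sg-example"}]},
    {"IpProtocol": "udp", "FromPort": 53, "ToPort": 53,
     "UserIdGroupPairs": [{"GroupId": "sg-example"}]},
]
print(missing_self_rules("sg-example", example_rules))
```

With real data, the rules would come from something like `boto3.client("ec2").describe_security_groups(GroupIds=[...])["SecurityGroups"][0]["IpPermissions"]`. An empty result means the self-referencing rules cover both protocols.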


3 REPLIES

Diogo_W
New Contributor III

Hi Kaniz, thanks for the reply.

I went through the log and I see this:

KeyboardInterrupt:
23/10/26 21:06:04 INFO ProgressReporter$: Removed result fetcher for 7389618138579564799_6933402728921115182_ee7173b16c654fea9ca6968ef33e5530
23/10/26 21:06:04 INFO PythonDriverWrapper: Stopping streams for commandId pattern: CommandIdPattern(7389618138579564799,None,Some(ee7173b16c654fea9ca6968ef33e5530)).
23/10/26 21:06:06 INFO ClusterLoadAvgHelper: Current cluster load: 0, Old Ema: 1.0, New Ema: 0.85
23/10/26 21:06:08 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
23/10/26 21:06:09 INFO ClusterLoadAvgHelper: Current cluster load: 0, Old Ema: 0.85, New Ema: 0.0
23/10/26 21:06:23 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

It looks like the cluster is not getting enough resources, as you mentioned. Any idea how to fix it?
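
In case it helps with spotting this in a longer driver log, here is a rough sketch that pulls out the timestamps of that warning. The regular expression assumes the line layout seen in the excerpt above (`yy/mm/dd hh:mm:ss LEVEL Logger: message`); the helper name `starved_timestamps` is just illustrative.

```python
import re

WARNING = "Initial job has not accepted any resources"
# Assumed layout: "23/10/26 21:06:08 WARN TaskSchedulerImpl: message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+) ([^:]+): (.*)$")


def starved_timestamps(log_lines):
    """Return timestamps of WARN lines saying no resources were accepted."""
    hits = []
    for line in log_lines:
        match = LINE_RE.match(line)
        if match and match.group(2) == "WARN" and match.group(4).startswith(WARNING):
            hits.append(match.group(1))
    return hits


sample = [
    "23/10/26 21:06:06 INFO ClusterLoadAvgHelper: Current cluster load: 0, Old Ema: 1.0, New Ema: 0.85",
    "23/10/26 21:06:08 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources",
    "23/10/26 21:06:23 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources",
]
print(starved_timestamps(sample))
```

If the warning keeps repeating every few seconds, the driver is still waiting for executors to register.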

Hi,

We are facing a similar issue. If you don't mind, could you share how you fixed the ports? We are using GCP.

Thanks