<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using a cluster of type SINGLE_USER to run parallel python tasks in one job in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/139973#M51324</link>
    <description>&lt;P&gt;I'm not sure I understand completely, but if you are running parallel tasks, each executed in its own notebook with the same variable names, the answer is no. Those variables are scoped to the Spark session or notebook, not to the cluster.&lt;/P&gt;&lt;P&gt;To share "data" at the cluster level you can use Cluster-Scoped Environment Variables,&amp;nbsp;Global Temp Views, Databricks Secrets for confidential data, or even shared files.&lt;/P&gt;</description>
    <pubDate>Fri, 21 Nov 2025 21:00:33 GMT</pubDate>
    <dc:creator>Coffee77</dc:creator>
    <dc:date>2025-11-21T21:00:33Z</dc:date>
    <item>
      <title>Using a cluster of type SINGLE_USER to run parallel python tasks in one job</title>
      <link>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/139950#M51322</link>
      <description>&lt;P&gt;Hi,&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have set up a job with multiple Spark Python tasks running in parallel. I have set up only one job cluster: single node, data security mode SINGLE_USER, using Databricks Runtime version&amp;nbsp;&lt;SPAN&gt;14.3.x-scala2.12.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;These parallel Spark Python tasks share some similar variable names, but they are not technically global variables; everything is defined under one main function per file.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Will the Python tasks somehow share these variables since I am using the same cluster? Can this ever happen on a Databricks cluster?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 21 Nov 2025 17:28:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/139950#M51322</guid>
      <dc:creator>oye</dc:creator>
      <dc:date>2025-11-21T17:28:37Z</dc:date>
    </item>
    <item>
      <title>Re: Using a cluster of type SINGLE_USER to run parallel python tasks in one job</title>
      <link>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/139973#M51324</link>
      <description>&lt;P&gt;I'm not sure I understand completely, but if you are running parallel tasks, each executed in its own notebook with the same variable names, the answer is no. Those variables are scoped to the Spark session or notebook, not to the cluster.&lt;/P&gt;&lt;P&gt;To share "data" at the cluster level you can use Cluster-Scoped Environment Variables,&amp;nbsp;Global Temp Views, Databricks Secrets for confidential data, or even shared files.&lt;/P&gt;</description>
      <pubDate>Fri, 21 Nov 2025 21:00:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/139973#M51324</guid>
      <dc:creator>Coffee77</dc:creator>
      <dc:date>2025-11-21T21:00:33Z</dc:date>
    </item>
    <item>
      <title>Re: Using a cluster of type SINGLE_USER to run parallel python tasks in one job</title>
      <link>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/140061#M51339</link>
      <description>&lt;P&gt;Hi, thanks for replying!&lt;/P&gt;&lt;P&gt;In my case, it would be running parallel tasks of type Spark Python task in a Lakeflow job. This is a screenshot of the setup:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="oye_0-1763974540051.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/21921iC3C0BB0DE84CFB7F/image-size/medium?v=v2&amp;amp;px=400" role="button" title="oye_0-1763974540051.png" alt="oye_0-1763974540051.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Aside from the fact that the tasks will share the same resources and thus might run slower, I wonder if there could be any other problem from cluster sharing.&lt;/P&gt;&lt;P&gt;But going by what you said, there should not be any problem with my setup.&lt;/P&gt;</description>
      <pubDate>Mon, 24 Nov 2025 08:57:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/140061#M51339</guid>
      <dc:creator>oye</dc:creator>
      <dc:date>2025-11-24T08:57:56Z</dc:date>
    </item>
    <item>
      <title>Re: Using a cluster of type SINGLE_USER to run parallel python tasks in one job</title>
      <link>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/140069#M51342</link>
      <description>&lt;P&gt;In my case, we have some jobs configured in a similar way and no issues so far. We are indeed leveraging global temp views at the cluster level to improve performance &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 24 Nov 2025 09:25:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/140069#M51342</guid>
      <dc:creator>Coffee77</dc:creator>
      <dc:date>2025-11-24T09:25:21Z</dc:date>
    </item>
    <item>
      <title>Re: Using a cluster of type SINGLE_USER to run parallel python tasks in one job</title>
      <link>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/140155#M51347</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/152073"&gt;@oye&lt;/a&gt;&amp;nbsp;- The variables' scope is local to the individual task, and they do not interfere with other tasks even if the underlying cluster is the same. In fact, the issue is normally the other way round: if you have to share a variable across tasks, use the solutions mentioned by&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/179536"&gt;@Coffee77&lt;/a&gt;: a Global Temp View or cluster-scoped environment variables.&lt;/P&gt;</description>
      <pubDate>Mon, 24 Nov 2025 11:49:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-a-cluster-of-type-single-user-to-run-parallel-python-tasks/m-p/140155#M51347</guid>
      <dc:creator>Raman_Unifeye</dc:creator>
      <dc:date>2025-11-24T11:49:56Z</dc:date>
    </item>
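    <!-- Editor's sketch, not part of the original thread: the scoping behaviour
         the replies above describe can be illustrated with plain Python. Each
         task's main function owns its local variables, so identical names never
         collide even when the tasks run concurrently on shared hardware. The
         names `run_tasks` and `task_main` are illustrative only. -->

```python
import threading


def run_tasks(n: int) -> list:
    """Run n concurrent 'tasks', each defining a local variable `result`."""
    outputs = [None] * n

    def task_main(task_id: int) -> None:
        # Every task uses the same variable name, but `result` is local to
        # this function call, so tasks never observe each other's value --
        # analogous to per-task scoping on a shared Databricks cluster.
        result = f"task-{task_id}"
        outputs[task_id] = result

    threads = [threading.Thread(target=task_main, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outputs


print(run_tasks(3))  # ['task-0', 'task-1', 'task-2']
```

    <!-- To deliberately share state across tasks, the thread's suggestions
         (global temp views, cluster-scoped environment variables) are the
         cluster-level mechanisms; local variables are not. -->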
  </channel>
</rss>

