Data Engineering

Execute Pyspark cells concurrently

Phani1
Valued Contributor

Hi Team,

Is it feasible to run PySpark cells concurrently in Databricks notebooks? If so, kindly provide instructions on how to accomplish this. We aim to execute the intermediate steps simultaneously.

In our scenario, several PySpark cells need to run at the same time, depending on a condition.

 

Regards,

Janga

1 REPLY

Kaniz
Community Manager

Hi @Phani1, you can run PySpark work concurrently from a Databricks notebook.

To achieve this, consider the following approaches:

  1. Using dbutils.notebook.run():

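    • The simplest way is to use the dbutils.notebook.run() utility. You can call it from a notebook cell to execute another notebook. Note that each call blocks until the called notebook finishes, so several calls placed one after another in the same cell run sequentially. To run them concurrently, submit each call from its own thread, for example with Python's concurrent.futures.ThreadPoolExecutor.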
    • Example usage:
      dbutils.notebook.run("/path/to/another_notebook", timeout_seconds=60, arguments={"arg1": "value1", "arg2": "value2"})
      
    • Replace /path/to/another_notebook with the actual path of the notebook you want to run concurrently. Adjust the arguments as needed.
  2. Running Multiple Notebooks Simultaneously:
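
As a minimal sketch of this approach, assuming each dbutils.notebook.run() call blocks until its notebook finishes, the calls can be submitted to a thread pool so that they overlap. The helper name run_notebooks_concurrently, the max_workers default, and the notebook paths in the usage comment are illustrative assumptions, not part of any Databricks API. The runner is passed in as a parameter so the helper can be tried outside Databricks with a stand-in function.

```python
from concurrent.futures import ThreadPoolExecutor


def run_notebooks_concurrently(run_notebook, jobs, max_workers=4):
    """Run every job on its own worker thread and return results in job order.

    run_notebook: callable taking (path, timeout_seconds, arguments);
                  inside Databricks this would be dbutils.notebook.run.
    jobs:         list of (path, timeout_seconds, arguments) tuples.
    max_workers:  hypothetical default; tune it to your cluster.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns immediately, so all jobs start without waiting
        # for the previous one to finish.
        futures = [
            pool.submit(run_notebook, path, timeout_seconds, arguments)
            for path, timeout_seconds, arguments in jobs
        ]
        # result() blocks until that job is done and re-raises its exception.
        return [future.result() for future in futures]


# Usage inside a Databricks notebook (paths and the condition are placeholders):
# jobs = [("/path/to/step_a", 600, {"arg1": "value1"})]
# if some_condition:
#     jobs.append(("/path/to/step_b", 600, {"arg2": "value2"}))
# results = run_notebooks_concurrently(dbutils.notebook.run, jobs)
```

Building the jobs list first is one way to apply a condition: only the notebooks that pass the check are submitted, and all of them then run at the same time. Each result is the string the called notebook returns through dbutils.notebook.exit().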

Remember to adapt these methods to your specific use case, and ensure that the intermediate steps execute simultaneously based on your condition.

Happy PySpark coding! 😊🚀