How to dynamically have the parent notebook call on a child notebook?

alau131
New Contributor

Hi! I'd appreciate some help with dynamically calling one notebook from another in Databricks and having the parent notebook get the DataFrame results from the child notebook. For background: I have a main Python notebook and multiple SQL notebooks. The Python notebook needs to call one of the SQL notebooks via a variable holding the SQL notebook's name, and the SQL notebook should return a DataFrame to the Python notebook. So I need the Python notebook to dynamically change the file path of whichever SQL notebook I want to call, and that's the part I'm stuck on. Here's what I tried:

  • The %run command doesn't allow variables in the notebook path, so I can't call the SQL notebooks dynamically.
  • dbutils.notebook.run() does allow a variable in the notebook path, but I don't know how to return the DataFrame results from the SQL notebook to the parent Python notebook (sketched below).

What would be the best way to accomplish what I'm looking for? Thank you so much for any input!
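
For reference, here's a minimal sketch of what I'm attempting; the folder and notebook names are just placeholders:

  # Parent Python notebook - what I'm attempting (folder/notebook names are placeholders).
  sql_notebook_name = "monthly_report"  # chosen at runtime

  # %run does not accept a variable in the notebook path, so this is out:
  # %run ./sql_notebooks/monthly_report

  # dbutils.notebook.run() does accept a dynamic path, but it only returns the
  # string passed to dbutils.notebook.exit() in the child, not a DataFrame:
  result = dbutils.notebook.run(f"./sql_notebooks/{sql_notebook_name}", 600)
  print(result)  # a string (or None), never the child's DataFrame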

2 REPLIES

loui_wentzel
New Contributor III

If you need to pass variables, you would indeed need to use dbutils.notebook.run() instead of %run.

As far as I'm aware, you can't return objects like DataFrames between notebook executions. If you need to return a DataFrame, the easiest solution is to write it as a table in Unity Catalog and fetch it from there when you need it.

Databricks is not really meant for passing DataFrames between notebooks - instead, create a notebook that runs the setup, functions, filtering, etc. and writes a table, which you can then easily read and use later. That way you also get better traceability, clarity, and separation of code, which is easier to maintain in the long run.
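
A minimal sketch of that pattern, assuming the child SQL notebook persists its result with a CREATE OR REPLACE TABLE ... AS SELECT ... statement and then, in a small final Python cell, calls dbutils.notebook.exit() with the table name (the paths, catalog, and table names below are made up):

  # Parent Python notebook - sketch only; paths and names are hypothetical.
  sql_notebook_name = "monthly_report"                 # chosen at runtime
  child_path = f"./sql_notebooks/{sql_notebook_name}"

  # The child writes its result to a Unity Catalog table and exits with the
  # table name, e.g. dbutils.notebook.exit("main.reporting.monthly_report").
  table_name = dbutils.notebook.run(child_path, 600)

  # Read the persisted result back as a DataFrame in the parent.
  df = spark.table(table_name)
  display(df)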

jameshughes
New Contributor III

What you are looking to do is really not the intent of notebooks, and you cannot pass complex data types between notebooks. You would need to persist your DataFrame from the child notebook so your parent notebook can retrieve the results after the child notebook completes. This is in line with what @loui_wentzel recommended.

The better pattern here would be to take the logic in each of your child notebooks and create a function for each in a Python library that you can call from your main notebook. Once you have your function library, build a wheel (.whl) file, install it on the cluster, import the library into your main notebook, and make the appropriate function call based on your business requirements.
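
A rough sketch of what that library could look like, using a hypothetical package my_reports; the function name and SQL are illustrative only:

  # my_reports/reports.py - packaged as a wheel and installed on the cluster (hypothetical).
  from pyspark.sql import DataFrame, SparkSession

  def monthly_report(spark: SparkSession) -> DataFrame:
      # The logic that previously lived in the child SQL notebook.
      return spark.sql("""
          SELECT region, SUM(amount) AS total_amount
          FROM main.raw.sales
          GROUP BY region
      """)

  # In the main notebook, after installing the wheel:
  # from my_reports import reports
  # df = reports.monthly_report(spark)
  # To pick a report dynamically by name:
  # df = getattr(reports, report_name)(spark)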

 
