08-27-2024 08:54 AM
I have a large notebook and want to split it into multiple notebooks and run them in parallel with Databricks Jobs. However, one of the notebooks uses a dataframe produced by another, so it has to run downstream of that one. Since it depends on the values of that dataframe, how can I access the dataframe from the upstream notebook without writing it to a file or to Unity Catalog? Thanks.
Labels: Workflows
Accepted Solutions
08-28-2024 03:45 AM
Hi @Amodak91, you could use the %run magic command from within the downstream notebook to call the upstream notebook. That runs it in the same context, so all of its variables, including the dataframe, are accessible without needing to persist anything; see the sketch below.
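As a minimal sketch (the notebook path `./upstream_prep`, the table `sales.orders`, and the dataframe name `df_orders` are hypothetical), suppose the upstream notebook defines:

```python
# upstream_prep notebook: build the dataframe the downstream step needs.
df_orders = (
    spark.read.table("sales.orders")          # assumed source table
    .filter("order_date >= '2024-01-01'")
)
```

The downstream notebook can then pull it in with a cell containing only the magic command (in Databricks, %run cannot share a cell with other code):

```python
%run ./upstream_prep
```

After that cell runs, `df_orders` is in scope in the downstream notebook's later cells, exactly as if it had been defined there.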
Alternatively, you could register a temp view on the dataframe and call the downstream notebook using the dbutils.notebook.run() method, as sketched after this paragraph.
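A sketch of that second approach (notebook path and view name again hypothetical). One caveat: dbutils.notebook.run() executes the child notebook in its own Spark session, so a plain session-scoped temp view from the parent is not visible there; a global temp view, which is scoped to the cluster, is, so the sketch uses that variant:

```python
# Upstream (orchestrating) notebook: expose the dataframe as a global
# temp view so it survives the session boundary created by
# dbutils.notebook.run().
df_orders.createOrReplaceGlobalTempView("orders_prepared")

# Run the downstream notebook; the second argument is a timeout in seconds.
dbutils.notebook.run("./downstream_report", 600)
```

```python
# downstream_report notebook: global temp views are resolved through the
# global_temp database.
df_orders = spark.table("global_temp.orders_prepared")
df_orders.groupBy("region").count().show()
```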
You can read more about both approaches here and decide which suits you better.