11-07-2022 01:53 PM
I have two notebooks created for my Delta Live Table pipeline. The first is a utils notebook with functions I will be reusing for other pipelines. The second contains the actual Delta Live Table definitions. I added both notebooks to the pipeline settings.
But the pipeline fails with the error 'Failed to execute python command for notebook', pointing to the function I created in my utils notebook. I also attempted to use a %run magic command to force the utils notebook to run first, but it did not work; I was given a warning that magic commands are not supported. Is there any way to force the Delta Live Table pipeline to load my utils notebook first so that its functions can be referenced while building the pipeline?
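For context, a minimal sketch of the setup described above (notebook contents, function, and table names are hypothetical):

# utils notebook (added to the pipeline settings)
def clean_columns(df):
    # lower-case and strip whitespace from every column name
    return df.toDF(*[c.strip().lower() for c in df.columns])

# DLT notebook (also added to the pipeline settings)
import dlt

@dlt.table(name="bronze_orders")
def bronze_orders():
    # fails with 'Failed to execute python command for notebook'
    # because clean_columns is not visible from this notebook
    return clean_columns(spark.read.table("raw.orders"))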
11-08-2022 11:54 AM
Hi @Dave Wilson, %run and dbutils are not supported in DLT. This is intentionally disabled because DLT is declarative and cannot perform data movement on its own.
To answer your first query, there is unfortunately no option to make the utils notebook run first. The only option is to combine the utils and main notebooks into one. This does not address the reusability aspect in DLT, and we have raised this feature request with the product team. The engineering team is working internally to address this issue, and we can expect a feature that addresses this use case soon.
Thanks.
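For example, a single combined notebook might look like the following sketch (helper and table names are hypothetical):

import dlt
from pyspark.sql import functions as F

# helper that would otherwise live in the utils notebook
def add_ingest_ts(df):
    return df.withColumn("ingest_ts", F.current_timestamp())

@dlt.table(name="bronze_events")
def bronze_events():
    return add_ingest_ts(spark.read.table("raw.events"))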
11-08-2022 06:43 AM
Per another question, we are unable to use either magic commands or dbutils.notebook.run with the Pro-level Databricks account or Delta Live Tables. Are there any other solutions for reusing generic functions from other notebooks within a Delta Live Table pipeline?
01-23-2023 06:36 AM
Hi @Vivian Wilfred and @Dave Wilson, we solved our code reusability problem with Repos, pointing the pipeline code at our shared code:
import sys, os
sys.path.append(os.path.abspath('/Workspace/Repos/[your repo]/[folder with the python scripts]'))
from your_class import *
This only works if your reusable code is in Python. Also, depending on what you want to do, we noticed that the DLT definitions are always executed as the last piece of code, no matter where they appear in the script.
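To expand on that, here is a hedged sketch of how a DLT notebook might consume a module from a repo this way (the repo path, module, function, and table names are placeholders):

import sys, os
sys.path.append(os.path.abspath('/Workspace/Repos/[your repo]/[folder with the python scripts]'))

from your_class import *   # assumes this module defines a transform such as normalize(df)
import dlt

@dlt.table(name="silver_orders")
def silver_orders():
    return normalize(spark.read.table("bronze_orders"))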
05-02-2023 07:43 AM
Could you explain this in more detail?
Let's say I have a notebook abc which is reusable, and pqr is the one I will mention in the DLT pipeline.
How do I call functions from notebook abc in pqr?
06-15-2024 10:35 AM
I replaced the magic commands in my notebooks using the sys and os libraries. When I run the code on a cluster it works correctly, but when I run it from the Delta Live Table pipeline it does not; when I check the current working directory it is something different. What additional configuration should I do?
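One way to narrow this down (a sketch, not a confirmed fix) is to print the working directory from inside the pipeline run and rely on absolute workspace paths instead of relative ones:

import os, sys

# the working directory inside a DLT run can differ from an interactive cluster
print(os.getcwd())

# an absolute workspace path avoids depending on the working directory
# (the path below is a placeholder)
sys.path.append('/Workspace/Repos/[your repo]/[utils folder]')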
08-26-2024 01:13 PM - edited 08-26-2024 01:29 PM
Hi Dave,
You can solve this by putting your utils into a Python file and referencing your .py file in the DLT notebook. I've provided a template for the Python file below:
STEP 1:
# import functions
from pyspark.sql import SparkSession
import IPython

# dbutils is not defined inside a plain .py file, so pull it from the IPython user namespace
dbutils = IPython.get_ipython().user_ns["dbutils"]
spark = SparkSession.builder.getOrCreate()

def myfunc1():
    test = 1
STEP 2: You will need to create an __init__.py file in the same directory where your utils.py file lives.
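For example, the folder might end up looking like this (the folder and file names are illustrative):

/Workspace/utils_folder/
    __init__.py    (can be empty; it marks the folder as an importable package)
    my_utils.py    (the file from STEP 1)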
STEP 3:
In your DLT notebook, you'll need to append the utils folder to sys.path and then import your utils file as a module.
# set path
import sys
sys.path.append("/Workspace/utils_folder")
# import libraries
import dlt
import my_utils
I suggest avoiding file names that collide with existing packages; for example, do not name a file pandas. I also suggest putting your utils file in a path separate from all your other files. This will make appending the path less risky.
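Once the path append and import succeed, the functions can be called inside the table definitions. A minimal sketch, assuming my_utils defines a transform such as add_ingest_ts(df) (a hypothetical name; the table and source names are also placeholders):

import sys
sys.path.append("/Workspace/utils_folder")

import dlt
import my_utils

@dlt.table(name="bronze_orders")
def bronze_orders():
    return my_utils.add_ingest_ts(spark.read.table("raw.orders"))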