2 weeks ago
Hello, I tried this:
Notebook1:
dbutils.jobs.taskValues.set(key = "my_key", value = "hi From Notebook1")
Notebook2:
X = dbutils.jobs.taskValues.get(taskKey="01", key="my_key", debugValue = "Fail")
print(X)
Here I get "Fail" as output; it's not fetching my_key.
2 weeks ago
Hi @Ritesh-Dhumne, if you run both notebooks manually, or outside a Databricks job with multiple tasks, taskValues will not work as expected. You should define your job with multiple tasks in Databricks Workflows.
2 weeks ago
Could you provide me the code and flow?
2 weeks ago
Hi @Ritesh-Dhumne ,
Follow my steps. I created 2 notebooks:
- the first one, called Notebook1, with the following content
- the second one, called Notebook2, with the following content that reads the value defined in Notebook1
Here's my definition of the workflow that uses those 2 notebooks:
Pay attention: the taskKey in the get method is named the same as the task in the workflow (Notebook1):
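For reference, a minimal sketch of what the two notebooks can contain (assuming the workflow tasks are named Notebook1 and Notebook2, with Notebook2 depending on Notebook1):

```python
# Notebook1 (runs as the task named "Notebook1" in the workflow)
dbutils.jobs.taskValues.set(key="my_key", value="hi From Notebook1")

# Notebook2 (runs as the task named "Notebook2", which depends on Notebook1).
# taskKey must be the upstream *task name* in the workflow, not "01".
x = dbutils.jobs.taskValues.get(taskKey="Notebook1", key="my_key", debugValue="Fail")
print(x)  # inside the job this prints the value set by Notebook1
```

When either notebook is run interactively instead of as a job task, get() falls back to debugValue, which is why "Fail" was printed earlier in this thread.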
When I ran workflow it works as expected:
2 weeks ago
Hi @szymon_dybczak, could you help me with a scenario? I'm trying to build a pipeline where Notebook1 captures the file name and format of the file in the catalog. Notebook2 will take the filename and format from Notebook1 and perform basic transformations.
2 weeks ago
Hi @Ritesh-Dhumne ,
Sure, but could you describe what you need help with? What's the problem? 🙂
2 weeks ago
I wanted to extract all the files in the volume I have uploaded in Notebook1, and then perform basic transformations in Notebook2. I also want to store the null/dirty records separately and a clean DataFrame separately, for all the files, in Community Edition.
2 weeks ago
Hi @Ritesh-Dhumne ,
I'm assuming that you mistakenly named Free Edition as Community since you're using volumes.
I'm not sure if I've understood your approach correctly, but at first glance it seems incorrect - you can't pass a DataFrame between tasks. What you can do is load all the files from the volume into a bronze table in Notebook1. You can use the special _metadata column to add information about the file_path from which each particular row originates. Here's an example of how to use it:
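A sketch of what that Notebook1 ingestion can look like (the volume path, file format, and bronze table name below are placeholders/assumptions; adjust them to your setup):

```python
from pyspark.sql.functions import col

# Read every file in the volume; change format/options to match your files
bronze_df = (
    spark.read.format("csv")
    .option("header", "true")
    .load("/Volumes/my_catalog/my_schema/my_volume/")  # placeholder path
    # _metadata is a hidden column on file-based sources; select it
    # explicitly to keep the source file path for each row
    .withColumn("source_file", col("_metadata.file_path"))
)

bronze_df.write.mode("overwrite").saveAsTable("bronze_files")  # assumed name
```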
Then, in Notebook2, you can apply your transformations based on this bronze table. You can count nulls, handle dirty data, and benefit from the fact that you can relate all these issues to a particular file, since this information is added to the bronze table through the _metadata special column.
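The clean/dirty split in Notebook2 is just row filtering on null checks. The rule can be illustrated in plain Python (hypothetical sample rows, not Spark code):

```python
# Hypothetical sample rows as dicts; None represents a null value
rows = [
    {"id": 1, "name": "alice", "amount": 10.0},
    {"id": 2, "name": None,    "amount": 5.5},   # dirty: null name
    {"id": 3, "name": "carol", "amount": None},  # dirty: null amount
]

# A row is "dirty" if any field is null; otherwise it is "clean"
dirty = [r for r in rows if any(v is None for v in r.values())]
clean = [r for r in rows if all(v is not None for v in r.values())]

print(len(clean), len(dirty))  # 1 clean row, 2 dirty rows
```

In PySpark, `df.na.drop()` gives you the clean rows directly; the dirty rows are everything else, and each can be written to its own table.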
From what I see you're in a learning process, so I won't introduce the concept of Auto Loader, which is pretty handy for file ingestion 🙂