Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Dynamic Jobs in Community Edition

Ritesh-Dhumne
New Contributor II

Hello, I tried this:

Notebook1:

dbutils.jobs.taskValues.set(key="my_key", value="hi From Notebook1")

Notebook2:

X = dbutils.jobs.taskValues.get(taskKey="01", key="my_key", debugValue="Fail")

print(X)

Here I get "Fail" as output; it's not fetching my_key.

 

7 REPLIES

saurabh18cs
Honored Contributor II

Hi @Ritesh-Dhumne, if you run both notebooks manually or outside a Databricks job with multiple tasks, taskValues will not work as expected. You should define your job with multiple tasks in Databricks Workflows.

Could you provide me the code and flow?

szymon_dybczak
Esteemed Contributor III

Hi @Ritesh-Dhumne ,

Follow my steps. I created 2 notebooks:

- the first one, called Notebook1, with the following content:

[Screenshot: Notebook1 calling dbutils.jobs.taskValues.set]

- the second one, called Notebook2, with the following content that reads the value defined in Notebook1:

[Screenshot: Notebook2 calling dbutils.jobs.taskValues.get]
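(In case the screenshots don't load, here's a minimal reconstruction of the two notebooks, assuming the same key as in your question and a job task named Notebook1:)

Notebook1:

dbutils.jobs.taskValues.set(key="my_key", value="hi From Notebook1")

Notebook2:

# taskKey must be the upstream task's name in the job, not an arbitrary ID like "01"
x = dbutils.jobs.taskValues.get(taskKey="Notebook1", key="my_key", debugValue="Fail")
print(x)  # prints "hi From Notebook1" when run inside the job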

 

Here's my definition of the workflow that uses those 2 notebooks:

[Screenshot: workflow definition with Notebook1 and Notebook2 tasks]
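If you prefer defining the workflow in code rather than through the UI, here's a rough equivalent using the databricks-sdk package (the job name and notebook paths are placeholders, and since no cluster is specified this assumes a serverless-capable workspace):

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

# Notebook2 depends on Notebook1, so values set upstream are readable downstream
job = w.jobs.create(
    name="dynamic-jobs-demo",
    tasks=[
        jobs.Task(
            task_key="Notebook1",
            notebook_task=jobs.NotebookTask(notebook_path="/Workspace/Users/me/Notebook1"),
        ),
        jobs.Task(
            task_key="Notebook2",
            notebook_task=jobs.NotebookTask(notebook_path="/Workspace/Users/me/Notebook2"),
            depends_on=[jobs.TaskDependency(task_key="Notebook1")],
        ),
    ],
)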

Pay attention: the taskKey in the get method has to be named the same as the task in the workflow (Notebook1):

[Screenshot: Notebook2's get call with taskKey="Notebook1"]

[Screenshot: the workflow task named Notebook1]

When I run the workflow, it works as expected:

[Screenshot: successful workflow run]

Ritesh-Dhumne
New Contributor II

Hi @szymon_dybczak, could you help me with a scenario? I'm trying to build a pipeline where Notebook1 captures the file name and format of a file in the catalog, and Notebook2 takes the filename and format from Notebook1 and performs basic transformations.

szymon_dybczak
Esteemed Contributor III

Hi @Ritesh-Dhumne ,

Sure, but could you describe what you need help with? What's the problem? 🙂

Ritesh-Dhumne
New Contributor II

I wanted to extract all the files in the volume I have uploaded in Notebook1, and then in Notebook2 perform basic transformations. I also want to store the null/dirty records separately and a clean DataFrame separately, for all the files, in Community Edition.

szymon_dybczak
Esteemed Contributor III

Hi @Ritesh-Dhumne ,

I'm assuming you mean the Free Edition rather than Community Edition, since you're using volumes.

I'm not sure if I've understood your approach correctly, but at first glance it seems incorrect: you can't pass a DataFrame between tasks. What you can do is load all the files from the volume into a bronze table in Notebook1. You can use the special _metadata column to add information about the file_path from which each particular row originates. Here's an example of how to use it:

[Screenshot: reading files with the _metadata column]
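In case the screenshot doesn't load, here's a minimal sketch of that pattern, assuming CSV files in a Unity Catalog volume (the catalog, schema, volume, and table names are placeholders):

from pyspark.sql.functions import col

# Read every file in the volume and record which file each row came from
bronze_df = (
    spark.read
    .format("csv")
    .option("header", "true")
    .load("/Volumes/my_catalog/my_schema/my_volume/")  # placeholder volume path
    .select("*", col("_metadata.file_path").alias("source_file"))
)

# Persist as the bronze table that Notebook2 will read
bronze_df.write.mode("overwrite").saveAsTable("my_catalog.my_schema.bronze_files")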

 

 

Then, in Notebook2, you can apply your transformations based on this bronze table. You can count nulls, handle dirty data, and benefit from the fact that you can relate all these issues to a particular file, since this information is added to the bronze table through the _metadata special column.
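A rough sketch of that Notebook2 logic, using a simple "any column is null" rule to separate dirty rows from clean ones (the table names continue the placeholders from above; tune the rule to your data):

from functools import reduce
from pyspark.sql.functions import col

bronze_df = spark.table("my_catalog.my_schema.bronze_files")

# A row counts as dirty if any column is null; the source_file column
# lets you trace every dirty row back to the file it came from
has_null = reduce(lambda a, b: a | b, [col(c).isNull() for c in bronze_df.columns])

dirty_df = bronze_df.filter(has_null)
clean_df = bronze_df.filter(~has_null)

dirty_df.write.mode("overwrite").saveAsTable("my_catalog.my_schema.bronze_files_dirty")
clean_df.write.mode("overwrite").saveAsTable("my_catalog.my_schema.bronze_files_clean")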

From what I see you're in a learning process, so I won't introduce the concept of Auto Loader, which is pretty handy for file ingestion 🙂
