Data Engineering
How to use partial_parse.msgpack with workflow dbt task?

Jfoxyyc
Valued Contributor

I'm looking for direction on how to get the dbt task in Workflows to use the partial_parse.msgpack file so it can skip parsing files that haven't changed. I'm downloading the run artifacts after each run, and the partial_parse file is saved back to ADLS.

What is the dbt task doing under the hood? Which folder does the Git repo specified in "Project directory" get checked out to? I'm assuming I could use an init script to load the file from ADLS and drop it into whichever folder ends up being /{project directory}/target/?
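Not an official answer, but the staging step described above can be sketched in plain Python. This is a minimal sketch under assumptions: the saved artifact is reachable from the driver as a local path (e.g. via a DBFS/ADLS mount such as `/dbfs/mnt/...`, which is hypothetical here), and dbt's default behavior of reading partial-parse state from `target/partial_parse.msgpack` inside the project directory.

```python
# Sketch: stage a previously saved partial_parse.msgpack into the dbt
# project's target/ directory before dbt runs, so dbt can reuse its
# partial-parse state instead of re-parsing every file.
# All paths are illustrative assumptions, not a documented Databricks API.
import shutil
from pathlib import Path


def stage_partial_parse(saved_artifact: str, project_dir: str) -> Path:
    """Copy a saved partial_parse.msgpack into <project_dir>/target/."""
    target_dir = Path(project_dir) / "target"
    target_dir.mkdir(parents=True, exist_ok=True)  # target/ may not exist yet
    dest = target_dir / "partial_parse.msgpack"
    shutil.copyfile(saved_artifact, dest)
    return dest
```

The same copy could be done in a one-line init script (`cp /dbfs/mnt/.../partial_parse.msgpack "$PROJECT_DIR/target/"`), assuming you can discover the checkout path of the project directory at runtime.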

2 REPLIES

Debayan
Esteemed Contributor III

Hi, could you please confirm your expectation and the use case? Do you want the file to be saved somewhere else?

Jfoxyyc
Valued Contributor

The use case is that I'd like the dbt task in a Databricks workflow to use the partial_parse.msgpack file so it can take advantage of partial parsing and skip files that haven't changed. That significantly speeds up runtime, since dbt doesn't waste time re-parsing.

I download the file from the run artifacts after each run and store it in an ADLS location, but I don't know how to get the dbt task to use it. I don't want to store it in the Git repo; I'd prefer to have the dbt task copy the file from ADLS somehow. If this were possible, we could also copy manifest.json and run `state:modified` selections for slim CI.
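The slim CI idea mentioned above can be sketched the same way: stage the previous run's manifest.json into a local state directory, then invoke dbt with `--select state:modified+ --state <dir>`. This is a sketch under assumptions; the paths and the helper name are made up for illustration, and only the `--state`/`state:modified` mechanism itself is standard dbt.

```python
# Sketch: stage last run's manifest.json and build a slim-CI dbt command.
# The state directory layout is an assumption for illustration; dbt only
# requires that --state points at a directory containing manifest.json.
import shutil
from pathlib import Path


def build_slim_ci_command(saved_manifest: str, state_dir: str) -> list:
    """Copy manifest.json into state_dir and return the dbt argv to run."""
    sd = Path(state_dir)
    sd.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(saved_manifest, sd / "manifest.json")
    # Run only models modified relative to the staged manifest (plus children).
    return ["dbt", "run", "--select", "state:modified+", "--state", str(sd)]
```

The returned argv could then be passed to the dbt task's command, or executed via `subprocess.run` in a wrapper step, assuming the environment already has dbt installed.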
