Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Running dbt in a Databricks task, the dbt_output from the Databricks jobs api is empty

augustsc
New Contributor II

I'm running a scheduled workflow with a dbt task on Azure Databricks. We want to export the dbt-output from the dbt task to a storage container for our Slim CI setup and data observability.

The issue is that the Databricks API endpoint (/api/2.1/jobs/runs/get-output), which should return a dbt_output object with a temporary link to the compressed artifacts from the run, always comes back empty, while logs, metadata objects, etc. are returned as expected. This makes little sense, because the log (stdout) from the run clearly states that the file has been created and exported to the job result in the managed storage. The compressed file exists, but I have no way of accessing it.

I have tried the REST API, the CLI, and the Python SDK, with my personal token as well as with the service principal that runs the workflow. I have also created a new workflow, all with the same result: dbt_output is not there.

Would greatly appreciate any pointers to what I'm missing!
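For reference, a minimal sketch of the lookup described above, using only the standard library. The host, token, and the run ID 123456789 are placeholders; the helper returns None when dbt_output is missing or empty, which is the behavior being reported:

```python
import json
import os
import urllib.request

def get_task_run_output(host: str, token: str, task_run_id: int) -> dict:
    """GET /api/2.1/jobs/runs/get-output for a single *task* run."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/runs/get-output?run_id={task_run_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def extract_dbt_artifacts_link(run_output: dict):
    """Return the temporary artifacts link from the dbt_output object,
    or None when dbt_output is missing or empty."""
    return (run_output.get("dbt_output") or {}).get("artifacts_link")

if os.environ.get("DATABRICKS_HOST"):  # only runs with real credentials
    out = get_task_run_output(
        os.environ["DATABRICKS_HOST"],
        os.environ["DATABRICKS_TOKEN"],
        123456789,  # hypothetical task run ID
    )
    print(extract_dbt_artifacts_link(out))
```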

6 REPLIES

augustsc
New Contributor II

Hi, thank you for your reply!

Is there a way to check if dbt-core/databricks is compatible with the Databricks runtime?

We are currently using dbt-core and dbt-databricks 1.7.3 with Databricks Runtime 14.2. Still, everything runs as expected, but dbt_output is empty. Should we perhaps downgrade? I can see from the note in the documentation that dbt version >= 1.6.0 is recommended.

AndrewT_dotDBC
New Contributor II

There is the job run ID, and then there is the task run ID, which will return the task-specific outputs such as the dbt data. Can you confirm whether you are using the task run ID or the job run ID?

[Screenshot attachment: AndrewT_dotDBC_0-1710336050941.png]

Thank you for your reply. Yes, we are using the task run ID, not the job ID or job run ID. I am able to get the output from the task; it is just that the dbt_output part is always empty.

d_strahl
New Contributor II

We're having the same issue: I get the output from a task, but not the dbt_output.
We're running 13.3 LTS and dbt 1.7.11.

d_strahl
New Contributor II

My bad, I read the actual documentation now 🙂 The output is only valid for 30 minutes after a run.
I had been looking at old jobs before; looking at the most recent job run helped.
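Given the 30-minute expiry mentioned above, a sketch of how one might always query the newest run rather than a stale one. The job_id 111 is a placeholder, and the helper sorts by start_time defensively rather than relying on the list order:

```python
import json
import os
import urllib.request

def most_recent_run_id(runs_list_response: dict):
    """Pick the newest run_id from a /api/2.1/jobs/runs/list response
    by start_time, so an old run with an expired output link is not
    queried by mistake."""
    runs = runs_list_response.get("runs", [])
    if not runs:
        return None
    return max(runs, key=lambda r: r.get("start_time", 0))["run_id"]

def api_get(host: str, token: str, path: str) -> dict:
    """Minimal authenticated GET against the Databricks REST API."""
    req = urllib.request.Request(
        f"{host}{path}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if os.environ.get("DATABRICKS_HOST"):  # only runs with real credentials
    host, token = os.environ["DATABRICKS_HOST"], os.environ["DATABRICKS_TOKEN"]
    runs = api_get(host, token, "/api/2.1/jobs/runs/list?job_id=111&limit=5")
    print(most_recent_run_id(runs))
```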

data-enthu
New Contributor II

I am running dbt in a Databricks job. It saves all documentation (manifest.json, run_results.json, etc.) under "Download Artifacts" in the job. I am not able to find a way to read those in code, transform them, and save them on Databricks.

I tried the Jobs API. The artifacts endpoint (api/2.1/jobs/runs/get-output) does return a link to read or download these artifacts, but it requires a task run ID as input, and no other Databricks API gives the task ID as output. So there is no complete path for me to get these artifacts today.

Can someone please help?
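On finding the task run ID: the response from /api/2.1/jobs/runs/get for a job-level run contains a tasks array, and each element carries its own run_id, which is the value that runs/get-output expects. A minimal sketch (the job run ID 222 is a placeholder):

```python
import json
import os
import urllib.request

def task_run_ids(job_run: dict) -> dict:
    """Map task_key -> per-task run_id from a /api/2.1/jobs/runs/get
    response; each element of the 'tasks' array carries the run_id
    that runs/get-output expects as its run_id parameter."""
    return {t["task_key"]: t["run_id"] for t in job_run.get("tasks", [])}

if os.environ.get("DATABRICKS_HOST"):  # only runs with real credentials
    host, token = os.environ["DATABRICKS_HOST"], os.environ["DATABRICKS_TOKEN"]
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/runs/get?run_id=222",  # job-level run ID
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        print(task_run_ids(json.load(resp)))
```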
