01-05-2024 05:04 AM
I'm running a scheduled workflow with a dbt task on Azure Databricks. We want to export the dbt-output from the dbt task to a storage container for our Slim CI setup and data observability.
The issue is that the Databricks API (/api/2.1/jobs/runs/get-output), which should return a dbt_output object with a temporary link to the compressed artifacts from the run, always comes back without it, while logs, metadata objects etc. are returned as expected. This makes little sense, because the log (stdout) from the run clearly states that the file has been created and exported to the job result in the managed storage. The compressed file exists, but I have no way of accessing it.
I have tried the REST API, the CLI and the Python SDK, with my personal token as well as with the service principal that runs the workflow. I have also created a new workflow, all with the same result: dbt_output is not there.
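For reference, this is roughly the call we are making (a minimal sketch using only the standard library; the host, token and run ID are placeholders, and the field names follow the Jobs 2.1 API):

```python
import json
import urllib.request

def fetch_run_output(host: str, token: str, run_id: int) -> dict:
    """Call /api/2.1/jobs/runs/get-output for a single task run."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/runs/get-output?run_id={run_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def get_dbt_artifacts_link(run_output: dict):
    """Return the temporary artifacts link, or None when dbt_output is
    missing from the payload (which is what we keep seeing)."""
    return run_output.get("dbt_output", {}).get("artifacts_link")

# Hypothetical usage:
# output = fetch_run_output("https://adb-12345.azuredatabricks.net", "<token>", 987654)
# print(get_dbt_artifacts_link(output))  # None in our case, despite the run succeeding
```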
Would greatly appreciate any pointers to what I'm missing!
01-19-2024 12:56 AM
Hi, thank you for your reply!
Is there a way to check if dbt-core/databricks is compatible with the Databricks runtime?
We are currently using dbt-core and dbt-databricks 1.7.3 with Databricks Runtime 14.2. Still, everything runs as expected, but dbt_output is empty. Should we perhaps downgrade? I can see from the note in the documentation that dbt version >= 1.6.0 is recommended.
03-13-2024 06:21 AM
There is the job run ID, and then there is the task run ID, which returns the task-specific outputs such as the dbt data. Can you confirm whether you are using the task run ID or the job run ID?
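To illustrate the difference (a minimal sketch; the field names follow the Jobs 2.1 API, and the sample payload is made up):

```python
# jobs/runs/get on the *job* run ID returns a "tasks" array, and each task's
# own run_id is what runs/get-output needs to include task-specific output.

def task_run_ids(job_run: dict) -> dict:
    """Map task_key -> task run ID from a /api/2.1/jobs/runs/get payload."""
    return {t["task_key"]: t["run_id"] for t in job_run.get("tasks", [])}

# Example shape of a job run with one dbt task:
sample = {
    "run_id": 1000,  # job run ID -- get-output on this will not carry dbt_output
    "tasks": [{"task_key": "dbt_task", "run_id": 1001}],
}
print(task_run_ids(sample))  # {'dbt_task': 1001}
```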
03-14-2024 04:28 AM
Thank you for your reply. Yes, we are using the task run ID, not the job ID or job run ID. I am able to get the output from the task; it is just that the dbt_output part is always empty.
04-04-2024 11:23 PM
We're having the same issue: I get the output from a task, but not the dbt_output.
We're running 13.3 LTS and dbt 1.7.11.
04-04-2024 11:29 PM
My bad, I read the actual documentation now 🙂 The output is only valid for 30 minutes after a run.
I was looking at old jobs before; looking at the most recent job run helped.
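In case it helps others, roughly the check we added before fetching the artifacts (a sketch; the 30-minute window is the one from the docs note, and the times are epoch milliseconds as the API reports them):

```python
def artifacts_window_open(end_time_ms: int, now_ms: int, window_min: int = 30) -> bool:
    """True while the run finished less than `window_min` minutes ago,
    i.e. while the runs/get-output artifacts link should still be valid."""
    return (now_ms - end_time_ms) < window_min * 60 * 1000

# A run that ended 29 minutes ago is still inside the window:
print(artifacts_window_open(0, 29 * 60 * 1000))  # True
# One that ended 31 minutes ago is not:
print(artifacts_window_open(0, 31 * 60 * 1000))  # False
```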
08-16-2024 12:20 PM
I am running dbt in a Databricks job. It saves all documentation (manifest.json, run_results.json, etc.) under "Download Artifacts" in the job. I have not been able to find a way to read those in code, transform them and save them on Databricks.
I tried the Jobs API. The artifacts endpoint (api/2.1/jobs/runs/get-output) does return a link to read or download these artifacts, but it requires the task ID as input, and I have not found any other Databricks API that returns the task ID. So today I have no complete way to get these artifacts.
Can someone please help?
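For anyone landing here later, this is the flow I am trying to piece together from the replies above (a sketch only, standard library, placeholder IDs and URLs; per the earlier reply, jobs/runs/get on the job run ID includes each task's own run_id in its tasks array):

```python
import urllib.request

def find_task_run_id(job_run: dict, task_key: str) -> int:
    """Pick a task's own run_id out of a /api/2.1/jobs/runs/get payload."""
    return next(t["run_id"] for t in job_run["tasks"] if t["task_key"] == task_key)

def build_artifacts_request(run_output: dict) -> urllib.request.Request:
    """Turn a runs/get-output payload (one that does contain dbt_output)
    into a download request for the compressed artifacts; the link is
    temporary, so this has to happen soon after the run finishes."""
    dbt = run_output["dbt_output"]
    return urllib.request.Request(
        dbt["artifacts_link"], headers=dbt.get("artifacts_headers") or {}
    )

# Hypothetical end-to-end usage (job_run and run_output fetched with the same
# authenticated GETs discussed elsewhere in this thread):
# task_run_id = find_task_run_id(job_run, "dbt_task")
# req = build_artifacts_request(run_output)
# with urllib.request.urlopen(req) as resp, open("dbt_artifacts.tar.gz", "wb") as f:
#     f.write(resp.read())
```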