Generate and export dbt documentation from the Workflow dbt task to S3
09-02-2022 06:07 AM
I'm testing the Databricks Jobs feature with a dbt task and would like some advice on managing dbt documentation.
I can use "dbt run" to run my models and then "dbt docs generate" to generate the documentation. But is it possible to export the generated files to GitHub or to a file system like AWS S3?
09-05-2022 05:14 AM
Hi,
Thanks for your answer. I actually used this documentation to set up my Databricks jobs, but it does not mention how to manage the documentation that dbt generates. I do not know whether Databricks already supports this or whether there is a workaround.
10-08-2022 04:05 PM
Hi @Kaniz Fatma,
The documentation mentions:
- Automatic archiving of the artifacts from job runs, including logs, results, manifests, and configuration.
When a dbt task runs, do the logs, manifests, and index.html automatically go back to the attached repo?
Is there a way to run Slim CI with the dbt task? Can we use pre-commit? It would be good to be able to inspect the manifest, capture the models that have changed, shallow clone them, test their transforms, and, if the tests succeed, run those models in prod.
12-28-2022 06:29 PM
You can use the Jobs API: call the jobs/runs/get-output endpoint and read dbt_output.artifacts_link from the response. It is an HTTP link to a tar.gz file that contains all the artifacts. You can then unpack the tar.gz and store the files in ADLS or S3.
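As a rough sketch of those steps in Python (not an official Databricks example): the code below calls jobs/runs/get-output, reads dbt_output.artifacts_link, downloads the tar.gz, and writes each file to S3. The API version in the path (2.1), the helper names, the bucket and prefix arguments, and the use of boto3 for the upload are all assumptions made for illustration.

```python
import io
import json
import posixpath
import tarfile
import urllib.parse
import urllib.request


def extract_artifacts_link(payload):
    """Return dbt_output.artifacts_link from a get-output response dict."""
    link = (payload.get("dbt_output") or {}).get("artifacts_link")
    if not link:
        raise ValueError("run output has no dbt_output.artifacts_link")
    return link


def get_artifacts_link(host, token, run_id):
    """Call jobs/runs/get-output for one run and return the artifacts link."""
    query = urllib.parse.urlencode({"run_id": run_id})
    # Assumption: API version 2.1; adjust the path for your workspace.
    url = f"{host.rstrip('/')}/api/2.1/jobs/runs/get-output?{query}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return extract_artifacts_link(json.load(response))


def unpack_artifacts(archive_bytes):
    """Return {member name: file bytes} for every regular file in a tar.gz."""
    files = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile():
                files[member.name] = archive.extractfile(member).read()
    return files


def s3_key(prefix, member_name):
    """Build an S3 key under prefix, dropping any leading './' in the name."""
    return f"{prefix.strip('/')}/{posixpath.normpath(member_name)}"


def export_run_artifacts(host, token, run_id, bucket, prefix):
    """Download the dbt artifacts of one run and copy them to S3."""
    import boto3  # third-party; imported here so the helpers above need only the stdlib

    link = get_artifacts_link(host, token, run_id)
    with urllib.request.urlopen(link) as response:
        files = unpack_artifacts(response.read())
    s3 = boto3.client("s3")
    for name, body in files.items():
        s3.put_object(Bucket=bucket, Key=s3_key(prefix, name), Body=body)
    return sorted(files)
```

You would call it with something like export_run_artifacts("https://<workspace-host>", token, run_id, "my-bucket", "dbt-docs/latest"), where every value is a placeholder. For a multi-task job you may need to pass the run ID of the dbt task itself rather than the parent job run.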
06-12-2024 02:49 AM
How can I access these target files from the task itself? I am trying to use dbt's state modifiers to detect models that changed and to run models only when the source freshness has changed. Is there an easy way to store and use these state files in S3 or the Databricks workspace? We are also using Databricks Asset Bundles to deploy our workflows and code, so maybe there's a way to use them for this problem?