We are working on automating our Databricks ingestion.
We want to write our Python scripts or notebooks so that they work both as Databricks Jobs and as DLT pipelines.
When I say Databricks Jobs, I mean a normal run without a DLT pipeline.
How should we approach this? Any resources or ideas?
Some of my thoughts:
- Regarding tables: DLT only has two table types, streaming tables and materialized views, so if we make sure a normal job run creates the same kinds of tables, it could work out.
- Similarly, with some if/else blocks we could keep a single script or notebook and handle the two modes dynamically.
Or should we create separate notebooks for each mode?
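To illustrate the single-notebook idea, here is a minimal sketch of runtime branching. The detection trick (the `dlt` module is importable only inside a pipeline run), the table name `bronze_events`, and the source path are all assumptions you would need to adapt and verify on your workspace:

```python
import importlib.util


def running_in_dlt() -> bool:
    """Best-effort check: assume the `dlt` module is only importable
    inside a Delta Live Tables pipeline run (verify on your workspace;
    checking the pipelines.id Spark conf is another common approach)."""
    return importlib.util.find_spec("dlt") is not None


def build_bronze(spark, source_path: str):
    """Shared transformation logic used by both execution modes."""
    return spark.read.format("json").load(source_path)


def ingest(spark, source_path: str = "/mnt/raw/events"):
    """Run the same logic either as a DLT table or as a plain job write."""
    if running_in_dlt():
        import dlt

        @dlt.table(name="bronze_events")  # hypothetical table name
        def bronze_events():
            return build_bronze(spark, source_path)
    else:
        # Plain job run: materialize the same data as a regular Delta table.
        df = build_bronze(spark, source_path)
        df.write.mode("overwrite").saveAsTable("bronze_events")
```

In a notebook you would call `ingest(spark)` at the top level; in a DLT run the decorated function registers the table, while in a job run the else branch writes it directly.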
How should we run the workloads:
- as separate DLT pipeline runs, or
- inside a Databricks job, by providing the DLT pipeline ID?
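For the second option, a Databricks job can wrap a DLT pipeline as a pipeline task, so the pipeline update is scheduled and monitored alongside other job tasks. A sketch of a job definition using the Jobs API's `pipeline_task` (the pipeline ID placeholder is yours to fill in; the job and task names are made up):

```json
{
  "name": "nightly_ingestion",
  "tasks": [
    {
      "task_key": "run_dlt_pipeline",
      "pipeline_task": {
        "pipeline_id": "<your-dlt-pipeline-id>",
        "full_refresh": false
      }
    }
  ]
}
```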