Hello Community,
I am working with a Delta Live Tables (DLT) pipeline that primarily operates in incremental mode. However, there are specific scenarios where I need to perform a full refresh of the pipeline. I am looking for an efficient and reliable way to determine, within the pipeline's Python codebase, whether it was triggered as a full refresh or a normal incremental run.
My Requirements:
- Dynamic Identification: The solution should enable the code to dynamically identify the type of run (full refresh vs. incremental).
- Pipeline Configuration: Ideally, this should be achieved by configuring something within the DLT pipeline, such as a parameter or flag.
- Accessing the Configuration: The configuration should be accessible within the Python code during execution, allowing me to assign the information to variables for downstream logic.
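To make the last two requirements concrete, here is a rough sketch of what I have in mind. It assumes a hypothetical configuration key, `mypipeline.run_type`, that I would set myself in the pipeline settings; as far as I know, DLT does not populate anything like this automatically. The helper names `resolve_run_type` and `is_full_refresh` are also my own. Inside the pipeline I would pass `spark.conf.get` as the `conf_get` argument.

```python
# Sketch only: the key name and helper functions are hypothetical, not a DLT API.
VALID_RUN_TYPES = {"full_refresh", "incremental"}


def resolve_run_type(conf_get, key="mypipeline.run_type", default="incremental"):
    """Read the run type through a getter shaped like spark.conf.get(key, default)."""
    raw = conf_get(key, default)
    # Normalize so that " FULL_REFRESH " and "full_refresh" are treated the same.
    value = (raw or default).strip().lower()
    if value not in VALID_RUN_TYPES:
        raise ValueError(f"Unexpected value for {key!r}: {raw!r}")
    return value


def is_full_refresh(conf_get, key="mypipeline.run_type"):
    """Convenience flag for downstream logic."""
    return resolve_run_type(conf_get, key=key) == "full_refresh"


# In the pipeline notebook (assumption: the key is set under the pipeline's configuration):
#   run_type = resolve_run_type(spark.conf.get)
```

The drawback of this approach is that the flag has to be changed by hand before each full refresh, so it can drift from how the update was actually triggered. That is why I am asking whether the pipeline itself exposes this information.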
My Questions:
- Is there an existing way in Databricks DLT to configure and identify the type of run?
- Can the run type (full refresh vs. incremental) be passed as a parameter or stored in a metadata table that the pipeline can read?
- Are there any best practices for handling such scenarios efficiently in DLT?
Any guidance, examples, or insights from your experience would be greatly appreciated.
Thank you in advance for your support!