08-05-2022 11:57 AM
I have a notebook used for a dlt pipeline. The pipeline should perform an extra task if the pipeline is run as a full refresh. Right now, I have to set an extra configuration parameter when I run a full refresh. Is there a way to programmatically access whether the current run is a full_refresh or a regular run?
08-08-2022 09:30 AM
@Ben Bogart You can query the create_update event in the pipeline's event log to determine whether a run was a full refresh or a regular run.
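For example, you can read the event log directly from the pipeline's storage location in a notebook. This is only a sketch: the storage path is a placeholder, and the exact JSON layout of the details column (e.g. create_update.full_refresh) is an assumption here, so verify the field names against your own event log before relying on them.

# Sketch: read the DLT event log and inspect create_update events.
# The path below is a placeholder -- point it at your pipeline's configured
# storage location; DLT writes the event log under system/events.
event_log_path = "dbfs:/pipelines/<your-pipeline-storage>/system/events"

events = spark.read.format("delta").load(event_log_path)

# create_update events carry the update settings in the 'details' JSON column.
# The field name details:create_update.full_refresh is an assumption --
# inspect the raw JSON in your own event log to confirm it.
create_updates = (
    events
    .filter("event_type = 'create_update'")
    .selectExpr(
        "timestamp",
        "origin.update_id",
        "details:create_update.full_refresh AS full_refresh",
    )
    .orderBy("timestamp", ascending=False)
)

create_updates.show(truncate=False)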
Alternatively, you can use the REST API. First, list the pipeline's updates:
2.0/pipelines/{pipeline_id}/updates
This call returns the pipeline's updates, including the update_id of each run.
You can then get the details for that update_id using the following API:
2.0/pipelines/{pipeline_id}/updates/{update_id}
The JSON response body should have a field 'full_refresh: true|false'.
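A sketch of those two calls from Python (the workspace URL, token, and pipeline_id are placeholders you would supply yourself; the response nesting under "updates" and "update" is what the 2.0 Pipelines API returns, but worth verifying against your workspace):

import requests

# Placeholders -- substitute your workspace URL, a personal access token,
# and the pipeline id.
host = "https://<your-workspace>.cloud.databricks.com"
token = "<personal-access-token>"
pipeline_id = "<pipeline-id>"
headers = {"Authorization": f"Bearer {token}"}

# 1. List updates for the pipeline; the most recent update is returned first.
updates = requests.get(
    f"{host}/api/2.0/pipelines/{pipeline_id}/updates",
    headers=headers,
).json()
latest_update_id = updates["updates"][0]["update_id"]

# 2. Fetch that update's details and read the full_refresh flag.
update = requests.get(
    f"{host}/api/2.0/pipelines/{pipeline_id}/updates/{latest_update_id}",
    headers=headers,
).json()
print(update["update"]["full_refresh"])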
08-10-2022 08:09 AM
@Debayan Mukherjee I would like to automate this from within a notebook that is a part of multiple pipelines. Both solutions require knowing the `pipeline_id`. How can I access the pipeline id from within a pipeline run?
08-30-2022 05:10 AM
The pipeline id for the current run can be read from the Spark configuration:
spark.conf.get("pipelines.id")
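Putting the two answers together, a minimal sketch of the notebook logic (the workspace URL and secret scope/key are placeholders, and treating the first entry in "updates" as the current run is an assumption):

# Sketch: inside a DLT pipeline run, read the pipeline id from the Spark
# conf and ask the REST API whether the current update is a full refresh.
import requests

pipeline_id = spark.conf.get("pipelines.id")

# Assumption: an API token is available, e.g. via a secret scope
# (scope/key names below are placeholders).
host = "https://<your-workspace>.cloud.databricks.com"
token = dbutils.secrets.get(scope="my-scope", key="api-token")
headers = {"Authorization": f"Bearer {token}"}

updates = requests.get(
    f"{host}/api/2.0/pipelines/{pipeline_id}/updates",
    headers=headers,
).json()

# Assumption: the first entry in "updates" is the current (latest) update.
is_full_refresh = updates["updates"][0].get("full_refresh", False)

if is_full_refresh:
    # ... perform the extra full-refresh-only task here ...
    pass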
09-20-2022 02:49 AM
Hi @Ben Bogart
Hope all is well! Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!