Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Trigger table sync from job

Malthe
Contributor

When setting up a table sync using the UI, a pipeline is created, but it is not visible in the Pipelines overview, presumably because it is "managed" by the target table (at least, that is where you manage the data ingest process).

This means that if the table sync runs in triggered mode, it cannot be triggered from a job using a pipeline task, which is otherwise possible (a "pipeline task can be the triggering mechanism for a triggered pipeline"), because we cannot look up the pipeline.

What's the recommended approach here?

1 ACCEPTED SOLUTION

Accepted Solutions

Saritha_S
Databricks Employee

Hi @Malthe 

The recommended method is to manage and trigger the sync via table-level APIs or management interfaces, not the pipeline-level job triggers:

  • For Unity Catalog synced tables (e.g., syncing to Postgres), a sync or refresh is triggered by interacting with the table's own API endpoint rather than the pipeline overview, for example by using the refresh API or a manual command at the table level.

  • If using Unity Catalog or a related table management system, you can issue an API call or use the Databricks CLI to manually trigger a sync (refresh), schedule triggers, or monitor the status of the ingest process at the table level.

  • This design separates table sync pipelines from user-managed job pipelines so that ingestion is controlled strictly via the target table's management UI/API, maintaining data integrity and reliability.

If your use case demands automated or programmatic triggering (for instance, after upstream data changes), it's best practice to:

  • Use the synced table's refresh API endpoint in your automation/orchestration tools.

  • Monitor and manage ingestion through the table sync management interface, not the pipeline jobs dashboard.
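As a rough sketch, a table-level refresh call from an orchestration tool could look like the following. Note that the endpoint path and the table name used here are assumptions for illustration only; check the Databricks REST API reference for the exact refresh endpoint for synced tables in your workspace before relying on this.

```python
# Hypothetical sketch: triggering a synced-table refresh via the Databricks
# REST API instead of a pipeline task. The endpoint path below is an
# ASSUMPTION for illustration -- verify it against the Databricks REST API
# docs for your cloud before use.
import os
import requests


def synced_table_refresh_url(host: str, full_table_name: str) -> str:
    """Build the (assumed) table-level refresh URL for a UC synced table."""
    return f"{host}/api/2.0/unity-catalog/tables/{full_table_name}/refresh"


def trigger_refresh(host: str, token: str, full_table_name: str) -> None:
    # POST to the table-level endpoint; no pipeline ID lookup is needed.
    resp = requests.post(
        synced_table_refresh_url(host, full_table_name),
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()


if __name__ == "__main__":
    # DATABRICKS_HOST / DATABRICKS_TOKEN and the table name are placeholders.
    trigger_refresh(
        os.environ["DATABRICKS_HOST"],
        os.environ["DATABRICKS_TOKEN"],
        "main.ingest.my_synced_table",
    )
```

A call like this can be dropped into any orchestrator step (a job notebook, Airflow task, etc.) that runs after upstream data changes.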

For additional details, please refer to the doc below:

https://docs.databricks.com/aws/en/dev-tools/ci-cd/best-practices

