refresh online table: How to get update_id and check status of a specific update
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-18-2024 03:50 PM
Hi!
I have a workflow job to trigger a refresh of an online table. How can I get the update_id with this specific refresh?
Also, is it possible to get the status from this specific update_id?
Thanks!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-18-2024 04:51 PM
To get the update_id for a specific refresh of an online table and retrieve its status, you can follow these steps:
Obtaining the update_id
When you trigger a refresh of an online table through a workflow job, you can typically access the update_id in one of two ways:
- Workflow output: Many workflow systems provide a way to capture the output or return value of the refresh operation. Check your workflow execution logs or output variables for the update_id.
- Query the event log: If the workflow doesn't directly provide the update_id, you can query the event log to find the most recent update for your table.
SELECT origin.update_id AS id
FROM event_log_raw
WHERE event_type = 'create_update'
ORDER BY timestamp DESC
LIMIT 1;
This query will return the update_id of the most recent update operation
Retrieving the status
Once you have the update_id, you can query the status of that specific update:
SELECT status
FROM event_log_raw
WHERE event_type = 'update_progress'
AND origin.update_id = '<your_update_id>'
ORDER BY timestamp DESC
LIMIT 1;
Replace <your_update_id>
with the actual update_id you obtained
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-19-2024 03:11 PM
Thanks!
Yes i'm able to get update_id. For getting status, is there a way to get it with Databricks SDK, something like `w.refresh_monitors.get_refresh({update_id})`?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-23-2024 01:45 AM
Hi @lauraxyz
Given you have your update_id, we can retrieve the status of the DLT pipeline using the below class from Databricks SDK.
Cheers!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-19-2024 03:20 PM
Another qq: Since online table has 3 sync mode: Snapshot, Triggered, and Continuous.
when refreshing the online table with
w.pipelines.start_update(pipeline_id='{pipeline_id}', full_refresh=True)
which sync mode is used by default?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-23-2024 01:58 AM - edited 12-23-2024 01:58 AM
Hi @lauraxyz, the class pipelines.StartUpdate, doesn't take any argument to specify mode. I'm assuming that it will trigger in the same mode in which the pipeline is created. Kindly let me know if you find anything to contradict my understanding.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-26-2024 10:36 AM
Thanks Riz! I think you are right, the sync mode was already specified when online table was created.

