cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

refresh online table: How to get update_id and check status of a specific update

lauraxyz
Contributor

Hi!

I have a workflow job to trigger a refresh of an online table. How can I get the update_id with this specific refresh?

Also, is it possible to get the status from this specific update_id?

Thanks!

6 REPLIES 6

Walter_C
Databricks Employee
Databricks Employee

To get the update_id for a specific refresh of an online table and retrieve its status, you can follow these steps:

Obtaining the update_id

When you trigger a refresh of an online table through a workflow job, you can typically access the update_id in one of two ways:

  1. Workflow output: Many workflow systems provide a way to capture the output or return value of the refresh operation. Check your workflow execution logs or output variables for the update_id.
  2. Query the event log: If the workflow doesn't directly provide the update_id, you can query the event log to find the most recent update for your table.
SELECT origin.update_id AS id FROM event_log_raw WHERE event_type = 'create_update' ORDER BY timestamp DESC LIMIT 1;

This query will return the update_id of the most recent update operation

Retrieving the status

Once you have the update_id, you can query the status of that specific update:

SELECT status FROM event_log_raw WHERE event_type = 'update_progress' AND origin.update_id = '<your_update_id>' ORDER BY timestamp DESC LIMIT 1;

Replace <your_update_id> with the actual update_id you obtained

Thanks!

Yes i'm able to get update_id.   For getting status, is there a way to get it with Databricks SDK, something like `w.refresh_monitors.get_refresh({update_id})`?

RiyazAli
Valued Contributor II

Hi @lauraxyz 

 

Given you have your update_id, we can retrieve the status of the DLT pipeline using the below class from Databricks SDK.

RiyazAli_0-1734947064158.png

Cheers!

Riz

lauraxyz
Contributor

Another qq: Since online table has 3 sync mode: Snapshot, Triggered, and Continuous.   

when refreshing the online table with 

w.pipelines.start_update(pipeline_id='{pipeline_id}', full_refresh=True)

 which sync mode is used by default? 

RiyazAli
Valued Contributor II

Hi @lauraxyz, the class pipelines.StartUpdate, doesn't take any argument to specify mode. I'm assuming that it will trigger in the same mode in which the pipeline is created. Kindly let me know if you find anything to contradict my understanding.

Riz

Thanks Riz! I think you are right, the sync mode was already specified when online table was created.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group