Sync table A to table B, triggered by any change in table A.

Yarden — Mon, 07 Oct 2024 11:48:15 GMT

Hey,

I'm trying to find a way to sync table A to table B whenever table A is written to, using only a trigger on write.
I want to avoid using any continuous runs or schedules.
Trying to get this to work inside Databricks, without having to use any outside listeners/triggers.

I looked into using a workflow that's triggered by new files in a volume, but I couldn't create a volume at the same location as the table I want to monitor.

What is the way to achieve this?

Thanks,
Yarden.

Re: Sync table A to table B, triggered by any change in table A.

Panda — Mon, 14 Oct 2024 23:23:33 GMT

@Yarden

For this use case, Databricks does not have built-in triggers tied directly to Delta table writes, the way traditional databases do. However, you can achieve this functionality using one of the following approaches:

Approach 1: File Arrival Triggers (Databricks Workflows)

You can configure a Databricks workflow to trigger based on file arrivals in a directory. If table A writes files to a specific location (e.g., S3 or ADLS), a workflow can be set up to trigger whenever new files are detected.

Steps:

Configure the workflow to monitor the location where table A’s data is written.
Set the workflow to trigger the syncing logic that updates table B.
For more details, review the relevant documentation.
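
As a rough sketch of the steps above, the job settings for a file arrival trigger could be built like this before sending them to the Jobs API or pasting them into the job's JSON editor. The field names follow the Jobs API 2.1 `trigger.file_arrival` settings as I understand them, so please verify them against the current docs. The job name, storage URL, and notebook path are hypothetical placeholders, and the task's compute settings are left out.

```python
import json


def build_file_arrival_job(job_name, watched_url, notebook_path):
    """Return job settings with a file arrival trigger on watched_url.

    The task only points at a notebook; compute settings are omitted
    and would need to be added for a real job.
    """
    return {
        "name": job_name,
        "trigger": {
            "pause_status": "UNPAUSED",
            "file_arrival": {
                # Storage location to monitor for new files.
                "url": watched_url,
                # Throttle how often the job can be triggered.
                "min_time_between_triggers_seconds": 60,
                # Wait for writes to settle before triggering.
                "wait_after_last_change_seconds": 60,
            },
        },
        "tasks": [
            {
                "task_key": "sync_a_to_b",
                "notebook_task": {"notebook_path": notebook_path},
            }
        ],
    }


# Hypothetical values, for illustration only.
settings = build_file_arrival_job(
    job_name="sync-table-a-to-b",
    watched_url="s3://my-bucket/path/to/table_a/",
    notebook_path="/Workspace/Users/me/sync_a_to_b",
)
print(json.dumps(settings, indent=2))
```

The notebook referenced by the task would hold the actual syncing logic that updates table B.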

Approach 2: Databricks Auto Loader with Directory Listing

Another option is to use Auto Loader on the directory where table A's data is written. When new files are added to the directory, Auto Loader detects them and processes them incrementally, so the job that runs it can sync the new data from table A to table B.
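
A minimal sketch of what that could look like in a notebook, assuming the files under table A's directory are Parquet and that `spark` is the session Databricks provides. The paths and table name in the usage comment are hypothetical. Note that reading the data files directly bypasses the Delta transaction log, so this only makes sense if table A is append-only.

```python
def sync_new_files(spark, source_path, target_table, checkpoint_path):
    """Copy files that newly arrived in source_path into target_table."""
    stream = (
        spark.readStream.format("cloudFiles")
        # Assumption: the source files are Parquet.
        .option("cloudFiles.format", "parquet")
        # Directory listing mode rather than file notifications.
        .option("cloudFiles.useNotifications", "false")
        # Where Auto Loader stores the inferred schema.
        .option("cloudFiles.schemaLocation", checkpoint_path)
        .load(source_path)
    )
    return (
        stream.writeStream
        # The checkpoint tracks which files were already processed.
        .option("checkpointLocation", checkpoint_path)
        # Process everything available now, then stop (no continuous run).
        .trigger(availableNow=True)
        .toTable(target_table)
    )


# Usage inside a Databricks notebook (hypothetical paths and names):
# query = sync_new_files(
#     spark,
#     source_path="s3://my-bucket/path/to/table_a/",
#     target_table="my_catalog.my_schema.table_b",
#     checkpoint_path="s3://my-bucket/checkpoints/a_to_b/",
# )
# query.awaitTermination()
```

With `availableNow=True` the stream stops once it has caught up, so something still has to start the run each time, for example the file arrival trigger from Approach 1.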
