<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Inquiring whether table triggers are the recommended tool for the job in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/inquiring-whether-table-triggers-are-the-recommended-tool-for/m-p/153323#M53957</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/215238"&gt;@David_Dabbs&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;This is a well-structured problem. Let me address each of the three concerns systematically, then recommend an overall pattern.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Notification: Table Triggers vs. Recommended Alternatives&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Table triggers on VIEWs are not the right tool here. Databricks does not support DML triggers (in the Oracle sense) on Delta tables or views. What Databricks offers instead is:&lt;/P&gt;&lt;P&gt;- &lt;STRONG&gt;Structured Streaming with trigger(availableNow=True)&lt;/STRONG&gt;&amp;nbsp;— a consumer-side poll that runs a job on a schedule and processes only new data since the last checkpoint. This is still polling, just abstracted.&lt;BR /&gt;- &lt;STRONG&gt;Delta Live Tables&lt;/STRONG&gt; — pipeline-oriented, not event-driven across workspace boundaries.&lt;BR /&gt;- &lt;STRONG&gt;Databricks Workflows "File Arrival" / table-based triggers&lt;/STRONG&gt; — these watch for new files in cloud storage, not Delta table writes.&lt;/P&gt;&lt;P&gt;The recommended pattern for cross-workspace, near-real-time notification is Databricks Workflows, driven either by direct job triggering or by a lightweight event bus. Given your Unity Catalog shared governance, there are two practical approaches:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Option A: Workflows REST API (Job Triggering)&lt;/STRONG&gt;&lt;BR /&gt;The producer workspace, upon completing its data load, calls the Databricks REST API to trigger a job in each consumer workspace. Each consumer pre-creates a Workflow job (notebook task) and grants the producer's service principal `CAN_MANAGE_RUN` permission on it. 
The producer calls `POST /api/2.1/jobs/run-now` against each consumer workspace endpoint.&lt;/P&gt;&lt;P&gt;Pros:&lt;BR /&gt;- Near-immediate, no polling&lt;BR /&gt;- Native to Databricks, no external infrastructure&lt;BR /&gt;- Manageable with your low consumer count&lt;BR /&gt;Cons:&lt;BR /&gt;- Producer must know each consumer's workspace URL and job ID (a small config table suffices)&lt;BR /&gt;- Bilateral trust: consumer workspaces must authorize the producer SP&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Option B: External Event Bus (AWS EventBridge or SNS/SQS)&lt;/STRONG&gt;&lt;BR /&gt;The producer publishes an event to AWS EventBridge or an SNS topic upon data load completion. Each consumer workspace has a Databricks Workflow with an HTTP webhook trigger or a Lambda-to-REST API bridge that fires the consumer job.&lt;/P&gt;&lt;P&gt;Pros:&lt;BR /&gt;- Fully decoupled&lt;BR /&gt;- Auditable event history&lt;BR /&gt;- Scales to more consumers without producer config changes&lt;BR /&gt;Cons:&lt;BR /&gt;- Requires AWS infrastructure outside Databricks&lt;BR /&gt;- More moving parts for a low-consumer scenario&lt;/P&gt;&lt;P&gt;For your stated scale (low consumer count, same UC governance domain), Option A is the pragmatic recommendation. The producer maintains a small config table of {consumer_workspace_url, job_id, sp_token_secret_scope} and fans out job triggers upon completion.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Delta Share Facade: Views Are the Right Call&lt;/STRONG&gt;&lt;BR /&gt;Your plan to expose a VIEW over the control table and the timeseries table via Delta Sharing is correct and idiomatic. A few reinforcing points:&lt;BR /&gt;Grant access to the consumer automation service principals, not user accounts, so permissions survive personnel changes. The view facade lets you control exactly which columns and rows consumers see — useful for isolating `is_complete` or `data_as_of_date` without exposing internal control columns. 
On the consumer side, the shared tables are read-only by definition in Delta Sharing, which cleanly enforces the separation of concerns.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Posting Results Back: The Recommended Pattern&lt;/STRONG&gt;&lt;BR /&gt;Databricks has no analog to Oracle's table-valued procedure parameters callable cross-workspace. The most workable option:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Recommended: Consumer-Owned Shared Table, Producer Pulls&lt;/STRONG&gt;&lt;BR /&gt;Each consumer workspace has a results Delta table (e.g., `consumer_ws.reporting.results_outbox`) shared back to the producer via a reverse Delta Share. The producer has a lightweight Workflow job — triggered by a callback the consumer makes when done — that reads from the shared table, validates, and writes to its own `producer.governance.consumer_results` table.&lt;BR /&gt;The callback is simple: when the consumer job finishes, it calls `POST /api/2.1/jobs/run-now` on the producer's ingest job, passing the `consumer_id` as a job parameter. The producer job then knows which shared table to read.&lt;BR /&gt;- Decoupled — consumer just writes locally and signals&lt;BR /&gt;- No cross-workspace write permissions needed&lt;BR /&gt;- Producer controls validation logic centrally&lt;BR /&gt;- Consumer gets feedback via a status table the producer writes back to a shared acknowledgment view&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;A few Unity Catalog housekeeping points worth flagging as you design this:&lt;BR /&gt;Use service principals (not PATs tied to users) for all cross-workspace job triggers; store their OAuth credentials in Databricks Secrets. The `CAN_MANAGE_RUN` permission on Jobs is workspace-local — you'll need to grant it per consumer workspace, but it's a one-time configuration per consumer onboarding. 
Delta Sharing across workspaces on the same UC metastore uses internal sharing (no external recipient overhead), which simplifies credential management significantly.&lt;/P&gt;</description>
    <pubDate>Sat, 04 Apr 2026 19:24:57 GMT</pubDate>
    <dc:creator>lingareddy_Alva</dc:creator>
    <dc:date>2026-04-04T19:24:57Z</dc:date>
    <item>
      <title>Inquiring whether table triggers are the recommended tool for the job</title>
      <link>https://community.databricks.com/t5/data-engineering/inquiring-whether-table-triggers-are-the-recommended-tool-for/m-p/153316#M53955</link>
      <description>&lt;P&gt;Seeking the DBRX-appropriate patterns for our application.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;We have a number of workspaces governed by the same Unity Catalog.&amp;nbsp;&lt;BR /&gt;One workspace we'll call the 'producer'. It manages data via external custom API interfaces.&lt;BR /&gt;There are a number of internal consumers who want timely notification, as immediate as possible, when this central 'producer' has acquired new data, approximately monthly, without them 'polling' the "last data arrival date" control table. Upon notification, they would execute a process - likely a Notebook - that accesses &amp;amp; processes the producer data and then, once processing is complete, reports a status back to the producer system.&lt;BR /&gt;&lt;BR /&gt;&lt;U&gt;Sharing and Acquisition&lt;/U&gt;&lt;BR /&gt;We intend to provision access to the producer data via a permissioned Delta share. The 'producer' and several 'consumer' workspaces are governed by the same Unity Catalog on AWS.&amp;nbsp; Our plan is for the producing system to create a VIEW facade over the 'latest data' control table and the single timeseries data table and grant access to the consuming workspace automation system accounts.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;U&gt;Notification&lt;/U&gt;&lt;BR /&gt;Are table triggers, created by the downstream consuming systems on the VIEW of the "data complete control table", the reliable and recommended tool for initiating the consumers' processing?&lt;BR /&gt;&lt;BR /&gt;&lt;U&gt;Posting Processing Results back to the Producer&lt;/U&gt;&lt;BR /&gt;Feedback from consumers is trivial - an id column from each consumed row plus a smallint result status code. The consumer must identify itself to the producer (so the producer knows which system is responding) when posting the results. The producer validates the submitted rows, and the consumer should get immediate feedback when there are errors with any of them. 
I recall that Oracle procedures can accept rowset parameters, but I have not seen something like this in DBRX - can a procedure be called across workspace process boundaries? We expect the consumers will have their processed 'status reports' ready in a table prior to posting. The number of consuming and reporting clients is low, so we could stomach configuring some bi-lateral arrangement - a job in the producer system that the consumer is permissioned to call/execute when ready to report. It would access a table Delta-shared from that consumer's system to the producer.&amp;nbsp; Surely there is a more de-coupled approach.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;I am new to DBRX and appreciate your suggestions in advance. Thanks,&lt;BR /&gt;DD&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 04 Apr 2026 18:11:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/inquiring-whether-table-triggers-are-the-recommended-tool-for/m-p/153316#M53955</guid>
      <dc:creator>David_Dabbs</dc:creator>
      <dc:date>2026-04-04T18:11:57Z</dc:date>
    </item>
    <item>
      <title>Re: Inquiring whether table triggers are the recommended tool for the job</title>
      <link>https://community.databricks.com/t5/data-engineering/inquiring-whether-table-triggers-are-the-recommended-tool-for/m-p/153323#M53957</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/215238"&gt;@David_Dabbs&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;This is a well-structured problem. Let me address each of the three concerns systematically, then recommend an overall pattern.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Notification: Table Triggers vs. Recommended Alternatives&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Table triggers on VIEWs are not the right tool here. Databricks does not support DML triggers (in the Oracle sense) on Delta tables or views. What Databricks offers instead is:&lt;/P&gt;&lt;P&gt;- &lt;STRONG&gt;Structured Streaming with trigger(availableNow=True)&lt;/STRONG&gt;&amp;nbsp;— a consumer-side poll that runs a job on a schedule and processes only new data since the last checkpoint. This is still polling, just abstracted.&lt;BR /&gt;- &lt;STRONG&gt;Delta Live Tables&lt;/STRONG&gt; — pipeline-oriented, not event-driven across workspace boundaries.&lt;BR /&gt;- &lt;STRONG&gt;Databricks Workflows "File Arrival" / table-based triggers&lt;/STRONG&gt; — these watch for new files in cloud storage, not Delta table writes.&lt;/P&gt;&lt;P&gt;The recommended pattern for cross-workspace, near-real-time notification is Databricks Workflows, driven either by direct job triggering or by a lightweight event bus. Given your Unity Catalog shared governance, there are two practical approaches:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Option A: Workflows REST API (Job Triggering)&lt;/STRONG&gt;&lt;BR /&gt;The producer workspace, upon completing its data load, calls the Databricks REST API to trigger a job in each consumer workspace. Each consumer pre-creates a Workflow job (notebook task) and grants the producer's service principal `CAN_MANAGE_RUN` permission on it. 
The producer calls `POST /api/2.1/jobs/run-now` against each consumer workspace endpoint.&lt;/P&gt;&lt;P&gt;Pros:&lt;BR /&gt;- Near-immediate, no polling&lt;BR /&gt;- Native to Databricks, no external infrastructure&lt;BR /&gt;- Manageable with your low consumer count&lt;BR /&gt;Cons:&lt;BR /&gt;- Producer must know each consumer's workspace URL and job ID (a small config table suffices)&lt;BR /&gt;- Bilateral trust: consumer workspaces must authorize the producer SP&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Option B: External Event Bus (AWS EventBridge or SNS/SQS)&lt;/STRONG&gt;&lt;BR /&gt;The producer publishes an event to AWS EventBridge or an SNS topic upon data load completion. Each consumer workspace has a Databricks Workflow with an HTTP webhook trigger or a Lambda-to-REST API bridge that fires the consumer job.&lt;/P&gt;&lt;P&gt;Pros:&lt;BR /&gt;- Fully decoupled&lt;BR /&gt;- Auditable event history&lt;BR /&gt;- Scales to more consumers without producer config changes&lt;BR /&gt;Cons:&lt;BR /&gt;- Requires AWS infrastructure outside Databricks&lt;BR /&gt;- More moving parts for a low-consumer scenario&lt;/P&gt;&lt;P&gt;For your stated scale (low consumer count, same UC governance domain), Option A is the pragmatic recommendation. The producer maintains a small config table of {consumer_workspace_url, job_id, sp_token_secret_scope} and fans out job triggers upon completion.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Delta Share Facade: Views Are the Right Call&lt;/STRONG&gt;&lt;BR /&gt;Your plan to expose a VIEW over the control table and the timeseries table via Delta Sharing is correct and idiomatic. A few reinforcing points:&lt;BR /&gt;Grant access to the consumer automation service principals, not user accounts, so permissions survive personnel changes. The view facade lets you control exactly which columns and rows consumers see — useful for isolating `is_complete` or `data_as_of_date` without exposing internal control columns. 
On the consumer side, the shared tables are read-only by definition in Delta Sharing, which cleanly enforces the separation of concerns.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Posting Results Back: The Recommended Pattern&lt;/STRONG&gt;&lt;BR /&gt;Databricks has no analog to Oracle's table-valued procedure parameters callable cross-workspace. The most workable option:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Recommended: Consumer-Owned Shared Table, Producer Pulls&lt;/STRONG&gt;&lt;BR /&gt;Each consumer workspace has a results Delta table (e.g., `consumer_ws.reporting.results_outbox`) shared back to the producer via a reverse Delta Share. The producer has a lightweight Workflow job — triggered by a callback the consumer makes when done — that reads from the shared table, validates, and writes to its own `producer.governance.consumer_results` table.&lt;BR /&gt;The callback is simple: when the consumer job finishes, it calls `POST /api/2.1/jobs/run-now` on the producer's ingest job, passing the `consumer_id` as a job parameter. The producer job then knows which shared table to read.&lt;BR /&gt;- Decoupled — consumer just writes locally and signals&lt;BR /&gt;- No cross-workspace write permissions needed&lt;BR /&gt;- Producer controls validation logic centrally&lt;BR /&gt;- Consumer gets feedback via a status table the producer writes back to a shared acknowledgment view&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;A few Unity Catalog housekeeping points worth flagging as you design this:&lt;BR /&gt;Use service principals (not PATs tied to users) for all cross-workspace job triggers; store their OAuth credentials in Databricks Secrets. The `CAN_MANAGE_RUN` permission on Jobs is workspace-local — you'll need to grant it per consumer workspace, but it's a one-time configuration per consumer onboarding. 
Delta Sharing across workspaces on the same UC metastore uses internal sharing (no external recipient overhead), which simplifies credential management significantly.&lt;/P&gt;</description>
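For concreteness, here is a minimal sketch of the Option A fan-out, assuming the config-table fields named in the reply ({consumer_workspace_url, job_id, sp_token_secret_scope}). The `/api/2.1/jobs/run-now` endpoint and `job_id` field are the real Jobs API; the hostnames, IDs, and helper function are illustrative only:

```python
# Minimal sketch of the Option A fan-out. The consumer rows would come from
# the producer's config table; here they are inlined. Names are illustrative.

def build_run_now_requests(consumers):
    """Build one Jobs 2.1 run-now request per consumer workspace."""
    reqs = []
    for c in consumers:
        reqs.append({
            "url": f"{c['workspace_url']}/api/2.1/jobs/run-now",
            "body": {"job_id": c["job_id"]},
            # secret scope holding the SP credential for this workspace
            "secret_scope": c["sp_token_secret_scope"],
        })
    return reqs

consumers = [
    {"workspace_url": "https://consumer1-workspace-host", "job_id": 111,
     "sp_token_secret_scope": "producer-fanout"},
    {"workspace_url": "https://consumer2-workspace-host", "job_id": 222,
     "sp_token_secret_scope": "producer-fanout"},
]

for req in build_run_now_requests(consumers):
    # each request would be POSTed with the SP token fetched from the
    # named secret scope; here we just show the fan-out targets
    print(req["url"])
```

The point of building the requests from a table rather than hard-coding them is that onboarding a new consumer becomes a config-table insert plus one permission grant, with no producer code change.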
      <pubDate>Sat, 04 Apr 2026 19:24:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/inquiring-whether-table-triggers-are-the-recommended-tool-for/m-p/153323#M53957</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2026-04-04T19:24:57Z</dc:date>
    </item>
    <item>
      <title>Re: Inquiring whether table triggers are the recommended tool for the job</title>
      <link>https://community.databricks.com/t5/data-engineering/inquiring-whether-table-triggers-are-the-recommended-tool-for/m-p/153330#M53958</link>
      <description>&lt;P&gt;Thank you &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/24053"&gt;@lingareddy_Alva&lt;/a&gt;. Your considered response lives up to your forum title: Esteemed Contributor.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;1. Notification.&amp;nbsp;&lt;/STRONG&gt;&lt;BR /&gt;Appreciate the confirmation that the determinism and control are worth the small bit of explicit configuration given the limited set of consumers.&lt;BR /&gt;&lt;BR /&gt;Regarding table triggers, did I misunderstand the capability described here:&amp;nbsp;&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/jobs/trigger-table-update?" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/trigger-table-update?&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Re. Option B, apologies for not having mentioned that we had already ruled out any AWS event-bus complexity. Option A is where we suspected we might end up. Regarding the means to execute the cross-ws job, we assumed that since the producer will be running a notebook job it can initiate the target consumer jobs via the API, e.g.:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from databricks.sdk import WorkspaceClient

# Point the client at the consumer workspace...
w_b = WorkspaceClient(host="https://consumer1-workspace-host", token="&amp;lt;token-from-secrets&amp;gt;")
# Trigger workspace job
w_b.jobs.run_now(job_id=&amp;lt;client1-processing-job-id&amp;gt;)&lt;/LI-CODE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;BR /&gt;2. Facade&lt;/STRONG&gt;&lt;BR /&gt;Agree - we use VIEWs like this in legacy PG to reduce coupling for schema evolution.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;3. Triggering Postbacks&lt;/STRONG&gt;&lt;BR /&gt;As my UK colleagues are fond of saying, "in for a penny, in for a pound." Since we're already using explicit calls, it makes sense to apply that pattern on the return trip. Thanks for the confirmation.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;4. Tips&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;The `CAN_MANAGE_RUN` permission on Jobs is workspace-local — you'll need to grant it per consumer workspace, but it's a one-time configuration per consumer onboarding.&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;Appreciate the pointers. So is the ACL to trigger configured at the calling-workspace level, irrespective of service principal, and then a specific principal granted a calling ACL on top of that?&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;DD&lt;/P&gt;</description>
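The return trip described in point 3 can be sketched the same way: the consumer, once its outbox table is written, triggers the producer's ingest job and identifies itself via a job parameter. The `job_id` and `job_parameters` fields are the real `run-now` request body; the host, job ID, and `consumer_id` key are hypothetical:

```python
# Sketch of the consumer-side callback on the return trip: the consumer
# signals the producer's ingest job, carrying its own identity as a job
# parameter so the producer knows which shared outbox table to read.
# Host, job ID, and parameter names are hypothetical.

def build_postback(producer_host, ingest_job_id, consumer_id):
    """run-now request that tells the producer which consumer is reporting."""
    return {
        "url": f"{producer_host}/api/2.1/jobs/run-now",
        "body": {
            "job_id": ingest_job_id,
            "job_parameters": {"consumer_id": consumer_id},
        },
    }

req = build_postback("https://producer-workspace-host", 999, "consumer1")
# with the SDK this would be:
#   w_p.jobs.run_now(job_id=999, job_parameters={"consumer_id": "consumer1"})
print(req["url"])
```

Passing the identity as a job parameter (rather than inferring it from the caller) keeps the producer's validation job a single parameterized workflow instead of one job per consumer.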
      <pubDate>Sat, 04 Apr 2026 20:13:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/inquiring-whether-table-triggers-are-the-recommended-tool-for/m-p/153330#M53958</guid>
      <dc:creator>David_Dabbs</dc:creator>
      <dc:date>2026-04-04T20:13:45Z</dc:date>
    </item>
    <item>
      <title>Re: Inquiring whether table triggers are the recommended tool for the job</title>
      <link>https://community.databricks.com/t5/data-engineering/inquiring-whether-table-triggers-are-the-recommended-tool-for/m-p/153339#M53959</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/215238"&gt;@David_Dabbs&lt;/a&gt;&amp;nbsp;for accepting this as a solution.&lt;/P&gt;&lt;P&gt;On table triggers: you were right that the feature exists, but the Delta Sharing constraint rules it out for your topology: tables shared via Delta Sharing are not supported as trigger sources (see&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/jobs/trigger-table-update" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/trigger-table-update&lt;/A&gt;). The explicit API call stands.&lt;/P&gt;&lt;P&gt;On CAN_MANAGE_RUN: not quite. It's a per-job ACL granted directly to a specific principal; there is no workspace-level intermediary. The host URL is just addressing.&lt;/P&gt;</description>
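To make the per-job grant concrete, here is a sketch of the request a consumer workspace admin would send to the workspace Permissions API. The endpoint shape, `access_control_list` field, and `CAN_MANAGE_RUN` level (the Jobs ACL level that permits triggering runs) are the real API; the host, job ID, and service principal application ID are placeholders:

```python
# Sketch of the one-time, per-job grant on the consumer side. PATCH on the
# Permissions API adds entries to the job's ACL without replacing it
# (PUT on the same endpoint replaces the whole ACL). IDs are placeholders.

def build_job_acl_grant(workspace_host, job_id, sp_application_id):
    """Permissions API request granting CAN_MANAGE_RUN on a single job."""
    return {
        "method": "PATCH",
        "url": f"{workspace_host}/api/2.0/permissions/jobs/{job_id}",
        "body": {
            "access_control_list": [
                {"service_principal_name": sp_application_id,
                 "permission_level": "CAN_MANAGE_RUN"}
            ]
        },
    }

grant = build_job_acl_grant("https://consumer1-workspace-host", 111,
                            "00000000-0000-0000-0000-000000000000")
print(grant["url"])
```

Because the grant targets one job and one principal, consumer onboarding stays a self-contained step: create the job, grant the producer SP on it, and register the job ID in the producer's config table.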
      <pubDate>Sun, 05 Apr 2026 02:40:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/inquiring-whether-table-triggers-are-the-recommended-tool-for/m-p/153339#M53959</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2026-04-05T02:40:46Z</dc:date>
    </item>
  </channel>
</rss>

