In Part 1, we covered why multi-table transactions matter. Now let's build one.
We'll create the tables from the claim wrap-up scenario, load sample P&C insurance data, and walk through what happens when the wrap-up succeeds, when it fails, and when two processes try to wrap the same claim at the same time. Everything runs on a Databricks SQL warehouse. You can follow along in a notebook or the SQL editor.
Setting up the demo
First, we need a catalog and schema. We'll also enable catalog-managed commits on every table we create, since that's what makes multi-table transactions possible.
CREATE CATALOG IF NOT EXISTS insurance_demo;
USE CATALOG insurance_demo;
CREATE SCHEMA IF NOT EXISTS claims_dw;
USE SCHEMA claims_dw;
Now let's create our four tables. The policies table is our reference data. It holds policy details and never gets written to during the wrap-up. The other three are where the action happens.
-- Reference table: policies (read-only during wrap-up)
CREATE OR REPLACE TABLE policies (
policy_id STRING,
customer_name STRING,
section STRING,
effective_date DATE,
expiry_date DATE
)
USING DELTA
TBLPROPERTIES ('delta.feature.catalogManaged' = 'supported');
-- Claims table: gets updated with policy FK and wrap status
CREATE OR REPLACE TABLE claims (
claim_id STRING,
policy_number STRING,
section STRING,
loss_date DATE,
claim_amount DECIMAL(12,2),
status STRING,
policy_id STRING,
wrap_date DATE
)
USING DELTA
TBLPROPERTIES ('delta.feature.catalogManaged' = 'supported');
-- Wrap log: audit trail of successful wraps
CREATE OR REPLACE TABLE claim_wrap_log (
claim_id STRING,
policy_id STRING,
wrap_date DATE,
wrap_method STRING
)
USING DELTA
TBLPROPERTIES ('delta.feature.catalogManaged' = 'supported');
-- Claim reserves: initial reserve posted when a claim wraps
CREATE OR REPLACE TABLE claim_reserves (
claim_id STRING,
policy_id STRING,
section STRING,
reserve_amount DECIMAL(12,2),
reserve_date DATE
)
USING DELTA
TBLPROPERTIES ('delta.feature.catalogManaged' = 'supported');
Now let's seed some data. We'll add a mix of property and motor policies, and a batch of claims that arrived from the claims system.
-- Seed policies
INSERT INTO policies VALUES
('POL-001', 'James Wilson', 'buildings', '2025-01-01', '2026-01-01'),
('POL-001', 'James Wilson', 'contents', '2025-01-01', '2026-01-01'),
('POL-002', 'Sarah Chen', 'accidental damage', '2025-06-01', '2026-06-01'),
('POL-002', 'Sarah Chen', 'third party injury', '2025-06-01', '2026-06-01'),
('POL-003', 'David Okafor', 'buildings', '2024-03-01', '2025-03-01'),
('POL-004', 'Emily Brown', 'accidental damage', '2025-09-01', '2026-09-01');
-- Seed claims (arrived from claims system, all status = 'open')
INSERT INTO claims (claim_id, policy_number, section, loss_date, claim_amount, status) VALUES
('CLM-101', 'POL-001', 'buildings', '2025-08-15', 12000.00, 'open'),
('CLM-102', 'POL-002', 'accidental damage', '2025-11-20', 4500.00, 'open'),
('CLM-103', 'POL-003', 'buildings', '2025-06-10', 8000.00, 'open'),
('CLM-104', 'POL-002', 'windscreen', '2025-12-01', 3200.00, 'open'),
('CLM-105', 'POL-999', 'buildings', '2025-07-01', 5500.00, 'open');
Take a moment to look at the claims. Some will wrap cleanly, some won't:
- CLM-101: Property buildings claim on POL-001. Loss date is within the policy period and "buildings" is a valid section. This should wrap.
- CLM-102: Motor accidental damage claim on POL-002. Loss date is within the policy period and the section matches. This should wrap too.
- CLM-103: Property buildings claim on POL-003. The policy expired on 2025-03-01, but the loss date is 2025-06-10. Outside the policy period. This should fail.
- CLM-104: Claim on POL-002, but the section is "windscreen". POL-002 only has "accidental damage" and "third party injury" sections. There's no "windscreen" section on this policy. This should fail.
- CLM-105: Says it belongs to POL-999, which doesn't exist at all. This should fail.
Scenario 1: Successful claim wrap-up
Let's wrap CLM-101. We'll use a non-interactive transaction with BEGIN ATOMIC...END and validate the business rules with IF and SIGNAL.
BEGIN ATOMIC
-- Rule 1: Does the policy exist?
IF NOT EXISTS (
SELECT 1 FROM policies WHERE policy_id = 'POL-001'
) THEN
SIGNAL SQLSTATE '45000'
SET MESSAGE_TEXT = 'Policy not found';
END IF;
-- Rule 2: Is the loss date within the policy period?
IF NOT EXISTS (
SELECT 1 FROM policies p
JOIN claims c ON c.policy_number = p.policy_id
WHERE c.claim_id = 'CLM-101'
AND p.policy_id = 'POL-001'
AND p.section = c.section
AND c.loss_date BETWEEN p.effective_date AND p.expiry_date
) THEN
SIGNAL SQLSTATE '45000'
SET MESSAGE_TEXT = 'Loss date falls outside policy period or section mismatch';
END IF;
-- All rules passed. Update the claim with policy FK and wrap status.
UPDATE claims
SET status = 'wrapped',
wrap_date = current_date(),
policy_id = 'POL-001'
WHERE claim_id = 'CLM-101';
-- Log the wrap event
INSERT INTO claim_wrap_log
VALUES ('CLM-101', 'POL-001', current_date(), 'auto');
-- Post the initial reserve
INSERT INTO claim_reserves
VALUES ('CLM-101', 'POL-001', 'buildings', 12000.00, current_date());
END;
Both rules pass, so the claim gets updated, the wrap log gets an entry, and the reserve gets posted. One transaction, one commit, three tables updated atomically.
You can verify:
SELECT * FROM claims WHERE claim_id = 'CLM-101';
SELECT * FROM claim_wrap_log WHERE claim_id = 'CLM-101';
SELECT * FROM claim_reserves WHERE claim_id = 'CLM-101';
CLM-101 wrapped: status updated and policy FK attached
CLM-101 wrap log entry with auto method
CLM-101 initial reserve posted for the buildings section
Scenario 2: Failed wrap-up with automatic rollback
Now let's try CLM-103. Remember, the loss date (2025-06-10) falls after the policy expiry (2025-03-01).
BEGIN ATOMIC
-- Rule 1: Does the policy exist?
IF NOT EXISTS (
SELECT 1 FROM policies WHERE policy_id = 'POL-003'
) THEN
SIGNAL SQLSTATE '45000'
SET MESSAGE_TEXT = 'Policy not found';
END IF;
-- Rule 2: Loss date within policy period and section match?
IF NOT EXISTS (
SELECT 1 FROM policies p
JOIN claims c ON c.policy_number = p.policy_id
WHERE c.claim_id = 'CLM-103'
AND p.policy_id = 'POL-003'
AND p.section = c.section
AND c.loss_date BETWEEN p.effective_date AND p.expiry_date
) THEN
SIGNAL SQLSTATE '45000'
SET MESSAGE_TEXT = 'Loss date falls outside policy period or section mismatch';
END IF;
UPDATE claims
SET status = 'wrapped',
wrap_date = current_date(),
policy_id = 'POL-003'
WHERE claim_id = 'CLM-103';
INSERT INTO claim_wrap_log
VALUES ('CLM-103', 'POL-003', current_date(), 'auto');
INSERT INTO claim_reserves
VALUES ('CLM-103', 'POL-003', 'buildings', 8000.00, current_date());
END;
This time, the SIGNAL fires because the loss date is outside the policy period. The entire transaction rolls back. Nothing gets written to any of the three tables. The claim stays as an orphan, ready to be picked up in tomorrow's run.
SIGNAL fires: loss date falls outside policy period
Verify that nothing changed:
SELECT * FROM claims WHERE claim_id = 'CLM-103';
-- status should still be 'open', policy_id should be NULL
CLM-103 unchanged: still open with no policy attached
Scenario 3: Interactive review of orphaned claims
Sometimes an adjuster needs to look at an orphaned claim by hand. Maybe the claim is legitimate but the data needs a correction before it can wrap. Interactive mode is built for this.
Let's say an adjuster is looking at CLM-104 (the claim with "windscreen" as the section, which doesn't match the policy).
BEGIN TRANSACTION;
-- Check what we're dealing with
SELECT c.claim_id, c.section AS claim_section, p.section AS policy_section
FROM claims c
JOIN policies p ON c.policy_number = p.policy_id
WHERE c.claim_id = 'CLM-104';
CLM-104 mismatch: windscreen vs policy sections
The adjuster sees the mismatch: the claim says "windscreen", but POL-002 only has "accidental damage" and "third party injury". After reviewing the original claim form, they determined it should have been "accidental damage". They correct it and wrap:
-- Fix the section and wrap
UPDATE claims
SET section = 'accidental damage',
status = 'wrapped',
wrap_date = current_date(),
policy_id = 'POL-002'
WHERE claim_id = 'CLM-104';
INSERT INTO claim_wrap_log
VALUES ('CLM-104', 'POL-002', current_date(), 'manual');
INSERT INTO claim_reserves
VALUES ('CLM-104', 'POL-002', 'accidental damage', 3200.00, current_date());
Before committing, the adjuster checks that everything looks right. This is the whole point of interactive mode: you can see the intermediate state inside the transaction before making anything permanent.
-- Verify the correction before committing
SELECT c.claim_id, c.section, c.status, c.policy_id, r.reserve_amount
FROM claims c
JOIN claim_reserves r ON c.claim_id = r.claim_id
WHERE c.claim_id = 'CLM-104';
CLM-104 corrected to accidental damage before commit
The section is corrected, the reserve is posted, and the amounts match. The adjuster commits:
COMMIT;
If something looked off, they could have run ROLLBACK instead and nothing would have been saved.
Scenario 4: Conflict detection
What happens when two processes try to wrap the same claim at the same time? Let's simulate it.
To try this yourself, you'll need two separate SQL sessions, like two SQL editor tabs on the same warehouse. The steps below alternate between them.
Imagine two wrap-up processes running at the same time. Both pick up CLM-102 (the accidental damage claim that should wrap successfully).
Process A starts first:
BEGIN TRANSACTION;
UPDATE claims
SET status = 'wrapped',
wrap_date = current_date(),
policy_id = 'POL-002'
WHERE claim_id = 'CLM-102';
INSERT INTO claim_wrap_log
VALUES ('CLM-102', 'POL-002', current_date(), 'auto');
INSERT INTO claim_reserves
VALUES ('CLM-102', 'POL-002', 'accidental damage', 4500.00, current_date());
Process B starts in a separate session before Process A commits:
BEGIN TRANSACTION;
UPDATE claims
SET status = 'wrapped',
wrap_date = current_date(),
policy_id = 'POL-002'
WHERE claim_id = 'CLM-102';
Now Process A commits:
-- Process A
COMMIT; -- succeeds
Process B tries to commit:
-- Process B
COMMIT; -- fails with a conflict error
Process B gets a conflict error because Process A already committed changes to the same rows. Databricks uses optimistic concurrency control: no locks are held during the transaction, and conflicts are detected at commit time. If two transactions touch overlapping data in the same table, the second one to commit loses and has to retry.
In most batch processing scenarios, that's exactly what you want. Conflicts are the exception, not the rule. Process B just needs to roll back, re-read the current state, and decide whether it still needs to act.
Key takeaways
A few things worth remembering:
- Non-interactive for pipelines, interactive for exploration. Use
BEGIN ATOMIC...END when you know exactly what needs to happen. Use BEGIN TRANSACTION...COMMIT when you need to inspect results before deciding.
- SIGNAL is your friend for business rules. Custom error messages make it obvious why a wrap-up failed, which beats a generic constraint violation every time.
- The policy table was read-only throughout. We joined against it to validate rules but never wrote to it. The transaction still keeps those reads consistent through snapshot isolation.
- Orphaned claims are a feature, not a bug. If the wrap fails, the data is clean. You just try again tomorrow with fresh data. The retry pattern works even better now that you know nothing was half written.
Try it yourself
If you missed it, Part 1 covers the motivation behind this feature and why it matters beyond data engineering.
The official docs are here: Multi-statement transactions on Databricks.
Try adapting this to your own domain. The claim wrap-up is just one example. Anywhere you're writing to multiple tables and need them to stay in sync, this is worth a look.
Got a question or spotted something I missed? Let me know in the comments.