Combining the SIGNAL statement with ATOMIC transactions in Databricks removes the need to manage commits and rollbacks by hand while still supporting custom validations - something modern big data ETL frameworks struggle to deliver cleanly. You get the ability to raise meaningful exceptions from pipeline logic while commit and rollback are handled automatically. You can validate data quality via business rule validations and take action via custom SIGNAL statements.
SIGNAL is a SQL statement that lets us deliberately raise user-defined exceptions from within the code logic. It's a guard rail inside the SQL pipeline code that keeps tables healthy. The SQLSTATE code supplied in the statement (for example '45000') is returned to the calling tool, and MESSAGE_TEXT allows embedding dynamic values - balances, states, account IDs - directly into the error message so the failure is easy to act on.
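Because the SQLSTATE travels back to the caller, the raising side and the handling side can also live in the same script. The sketch below assumes Databricks SQL scripting with condition handlers and a hypothetical pipeline_errors logging table - it is an illustration of the pattern, not code from the cases that follow:

```sql
-- Sketch: naming a condition and handling it in the same script.
-- Assumes Databricks SQL scripting (DECLARE ... CONDITION / EXIT HANDLER)
-- and a hypothetical pipeline_errors(ts, msg) table for logging.
BEGIN
  DECLARE data_gap CONDITION FOR SQLSTATE '45000';
  DECLARE EXIT HANDLER FOR data_gap
  BEGIN
    -- React to the failure, e.g. record it for alerting
    INSERT INTO pipeline_errors VALUES (current_timestamp(), 'Data gap in staging load');
  END;
  -- ... pipeline logic would go here ...
  SIGNAL data_gap SET MESSAGE_TEXT = 'No accounts data loaded for yesterday';
END
```

If no handler is declared, the exception simply propagates to the calling tool (notebook, JDBC client, or job) with the SQLSTATE and message intact.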
We shall look at three cases - Financial Transaction Enforcement, ETL Staging Validation, and Data Quality Checks - and show how to wire each one up.
Case 1 - Financial Guard
When a debit is applied to a bank account, a hard rule must be enforced for customer transactions: the balance must not fall below zero. With SIGNAL, you can embed this logic directly where the change happens instead of scattering custom checks across notebooks.
BEGIN ATOMIC
  DECLARE v_remaining_balance INT;
  UPDATE accounts SET balance = balance - 1100 WHERE account_id = 101;
  SET v_remaining_balance = (SELECT balance FROM accounts WHERE account_id = 101);
  -- Cancel if the balance is below zero after deduction
  IF v_remaining_balance < 0 THEN
    SIGNAL SQLSTATE '45042' SET MESSAGE_TEXT = CONCAT(
      'Insufficient balance for account 101. ',
      'Requested: 1100. ',
      'Balance would fall to: ', CAST(v_remaining_balance AS STRING), ' units. ',
      'Add balance before retrying.');
  END IF;
END

Case 2 - ETL Staging Validation
Data pipelines that load data from raw into staging tables face a tricky class of failure - the job runs cleanly but the expected data has not arrived. Without an explicit check, this creates a silent operational blind spot that might not surface until a downstream report shows a dip. With transactions and SIGNAL, the pipeline loads data into staging, validates the expected row count, and merges into the core table only if the expected data is available; if there is a gap, the entire code block rolls back atomically.
BEGIN ATOMIC
  -- Clear old staging data
  DELETE FROM staging_accounts;
  -- Load new data
  INSERT INTO staging_accounts
  SELECT * FROM raw_accounts WHERE load_date = current_date() - INTERVAL 1 DAY;
  -- Validate count & fail the transaction if no data
  IF (SELECT COUNT(*) FROM staging_accounts) = 0 THEN
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'No accounts data loaded for yesterday';
  END IF;
  -- Merge into core
  MERGE INTO accounts AS target
  USING staging_accounts AS source
  ON target.account_id = source.account_id
  WHEN MATCHED THEN UPDATE SET *
  WHEN NOT MATCHED THEN INSERT *;
END

Case 3 - ETL Data Quality Validation
Data quality checks are necessary to ensure dashboards display trustworthy data. A pipeline might load 50,000 rows into a table, 5,000 of which carry empty or invalid account IDs, corrupting the table and its dashboards. This case extends the staging approach with a data quality validation: after loading, we check the rows against a business rule (invalid account IDs) and block the merge if invalid data is found. The SIGNAL fires with a clear description of the business quality failure, and the core table stays clean because the ATOMIC transaction rolls everything back.
BEGIN ATOMIC
  -- Clear old staging data
  DELETE FROM staging_accounts;
  -- Load new data
  INSERT INTO staging_accounts
  SELECT * FROM raw_accounts WHERE load_date = current_date() - INTERVAL 1 DAY;
  -- Validate data quality: reject empty or missing account IDs
  IF (SELECT COUNT(*) FROM staging_accounts
      WHERE account_id = '' OR account_id IS NULL) > 0 THEN
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Invalid accounts loaded';
  END IF;
  -- Merge into core
  MERGE INTO accounts AS target
  USING staging_accounts AS source
  ON target.account_id = source.account_id
  WHEN MATCHED THEN UPDATE SET *
  WHEN NOT MATCHED THEN INSERT *;
END
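Following the earlier point about embedding dynamic values in MESSAGE_TEXT, a variant of the quality check can capture the invalid-row count first so the error message says how bad the problem is. This is a sketch against the same staging_accounts table, showing only the validation step:

```sql
-- Sketch: embed the invalid-row count in the error message
-- (same staging_accounts table as above, validation step only).
BEGIN ATOMIC
  DECLARE v_invalid_count INT;
  SET v_invalid_count = (SELECT COUNT(*) FROM staging_accounts
                         WHERE account_id = '' OR account_id IS NULL);
  IF v_invalid_count > 0 THEN
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = CONCAT(
      'Invalid accounts loaded: ', CAST(v_invalid_count AS STRING),
      ' rows have an empty or missing account_id.');
  END IF;
END
```

Anyone paged for the failure then sees the scale of the problem in the message itself, without having to query staging first.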