We've all been there. You need to loop through a list of tables, apply some transformation, handle a few edge cases, and maybe catch an error or two. In most data platforms, that means switching to Python or Scalaโeven if 90% of your logic is just SQL anyway.
SQL Scripting solves this exact problem, and it's a genuine productivity boost for SQL-first engineers. I've been using it for admin tasks and ELT jobs, and I'm not going back.
Prerequisites
- Databricks workspace (any cloud)
- A SQL Warehouse or cluster with DBR 14.3+
- Basic SQL knowledge (you don't need to be a PL/SQL veteran)
What SQL Scripting Gives You
SQL Scripting follows the ANSI SQL/PSM standard, which means the skills transfer if you've ever worked with T-SQL, PL/pgSQL, or similar. Here's the quick rundown:
- Local variables with DECLARE
- Control flow: IF/THEN/ELSE, CASE statements
- Four loop types: FOR, WHILE, REPEAT, and plain LOOP
- Dynamic SQL via EXECUTE IMMEDIATE
- Exception handling with condition handlers
Everything runs inside a BEGIN...END compound statement.
A Real Example: Bulk-Updating Column Collations
Say you've inherited a schema where every STRING column uses default collation, and now you need case-insensitive sorting across the board. Manually altering 50 tables? No thanks.
Here's how it can be done with SQL Scripting: ( sample code - not tested).
BEGIN
DECLARE v_table_name STRING DEFAULT 'employees';
DECLARE v_collation STRING DEFAULT 'UTF8_LCASE';
DECLARE v_columns ARRAY<STRING> DEFAULT ARRAY();
EXECUTE IMMEDIATE
'ALTER TABLE ' || v_table_name || ' SET DEFAULT COLLATION ' || v_collation;
FOR col IN (
SELECT column_name
FROM information_schema.columns
WHERE table_name = v_table_name
AND data_type = 'STRING'
) DO
EXECUTE IMMEDIATE
'ALTER TABLE ' || v_table_name ||
' ALTER COLUMN ' || col.column_name ||
' SET COLLATION ' || v_collation;
SET v_columns = array_append(v_columns, col.column_name);
END FOR;
IF size(v_columns) > 0 THEN
EXECUTE IMMEDIATE
'ANALYZE TABLE ' || v_table_name ||
' COMPUTE STATISTICS FOR COLUMNS ' ||
array_join(v_columns, ', ');
END IF;
END;
The FOR loop queries information_schema.columns and iterates row by row. EXECUTE IMMEDIATE handles the dynamic DDL. Simple.
Scaling to an Entire Schema
Want to hit every table in a schema? Nest another loop:
BEGIN
DECLARE v_schema STRING DEFAULT 'my_schema';
DECLARE v_collation STRING DEFAULT 'UTF8_LCASE';
DECLARE v_skipped ARRAY<STRING> DEFAULT ARRAY();
FOR tbl IN (
SELECT table_name
FROM information_schema.tables
WHERE table_schema = v_schema AND table_type = 'BASE TABLE'
) DO
BEGIN
DECLARE v_columns ARRAY<STRING> DEFAULT ARRAY();
DECLARE CONTINUE HANDLER FOR SQLEXCEPTION
SET v_skipped = array_append(v_skipped, tbl.table_name);
EXECUTE IMMEDIATE
'ALTER TABLE ' || v_schema || '.' || tbl.table_name ||
' SET DEFAULT COLLATION ' || v_collation;
FOR col IN (
SELECT column_name
FROM information_schema.columns
WHERE table_schema = v_schema
AND table_name = tbl.table_name
AND data_type = 'STRING'
) DO
EXECUTE IMMEDIATE
'ALTER TABLE ' || v_schema || '.' || tbl.table_name ||
' ALTER COLUMN ' || col.column_name ||
' SET COLLATION ' || v_collation;
SET v_columns = array_append(v_columns, col.column_name);
END FOR;
END;
END FOR;
SELECT v_skipped AS skipped_objects;
END;
The DECLARE CONTINUE HANDLER FOR SQLEXCEPTION catches any errors (like trying to alter a view), logs the object name, and moves on. No more script crashes halfway through.
Loop Types Cheat Sheet
| Loop |
When to Use |
| FOR query DO...END FOR |
Iterating over query results (most common) |
| WHILE cond DO...END WHILE |
Pre-condition check, might run zero times |
| REPEAT...UNTIL cond END REPEAT |
Always runs at least once |
| LOOP...END LOOP |
Infinite loopโbreak with LEAVE |
Use LEAVE to exit a loop early and ITERATE to skip to the next iteration (like continue in Python).
Gotchas & Pro Tips
- Variables must be declared first. All DECLARE statements come before any logic in a BEGIN...END block.
- FOR loops don't create a new scope. If you need a fresh variable per iteration, wrap the body in a nested BEGIN...END.
- Combine with AI functions. You can call ai_fix_grammar() or ai_classify() inside loops for bulk data cleansing. Just watch the API costs.
Wrap Up
SQL Scripting isn't trying to replace Python for complex ML pipelines. But for admin automation, schema migrations, and ELT logic? It's exactly what SQL-first engineers have been waiting for.
Give it a shot on your next bulk update job. You might be surprised how much you can do without switching languages.