Hi @maarko
This is a fascinating issue that points to several potential causes related to differences between
Photon and standard Spark execution engines, particularly around decimal handling and parallelism.
Root Causes
1. Decimal Precision and Scale Handling
Comparing columns with different decimal types, DECIMAL(21,5) vs DECIMAL(38,6), forces an implicit conversion to a common type, and that conversion can behave differently between Photon and standard Spark. Even after explicit casting, the underlying comparison logic may differ between the two engines.
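As a sketch (table names A and B and the join key ID are placeholders standing in for your schema), casting both sides to the wider type up front removes the implicit conversion from the equation:

```sql
-- Cast both sides to the wider type so neither engine has to
-- pick an implicit common type for the comparison.
SELECT A.COL1, B.COL2
FROM A
LEFT OUTER JOIN B
  ON A.ID = B.ID
WHERE CAST(A.COL1 AS DECIMAL(38,6)) != CAST(B.COL2 AS DECIMAL(38,6));
```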
2. Parallel Execution Non-Determinism
Non-Photon clusters use standard Spark's parallel execution, which can introduce non-deterministic behavior when:
- Hash-based operations encounter hash collisions
- Floating-point arithmetic operations are reordered
- Memory pressure causes different execution paths
3. NULL Handling in LEFT OUTER JOIN
With a LEFT OUTER JOIN, B.COL2 is NULL for every unmatched row. In SQL, A.COL1 != B.COL2 evaluates to NULL (not TRUE) when B.COL2 is NULL, so those rows are silently filtered out, and edge cases in this NULL handling might differ between engines.
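A quick illustration of the trap (same placeholder schema as above):

```sql
-- Unmatched rows from the LEFT OUTER JOIN have B.COL2 = NULL.
-- A.COL1 != NULL evaluates to NULL (unknown), and WHERE only keeps
-- rows whose predicate is TRUE, so these rows disappear.
SELECT A.COL1, B.COL2
FROM A
LEFT OUTER JOIN B
  ON A.ID = B.ID
WHERE A.COL1 != B.COL2;  -- silently excludes rows where B.COL2 IS NULL
```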
Recommended Solutions
Solution 1: Explicit NULL Handling
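For example (a sketch against the same placeholder schema), either test for NULL explicitly or use Spark SQL's null-safe equality operator <=>:

```sql
-- Option A: keep unmatched rows explicitly.
SELECT A.COL1, B.COL2
FROM A
LEFT OUTER JOIN B
  ON A.ID = B.ID
WHERE B.COL2 IS NULL
   OR A.COL1 != B.COL2;

-- Option B: null-safe inequality. <=> treats NULL as a comparable
-- value, so NOT (x <=> y) is TRUE whenever the two sides differ,
-- including NULL vs non-NULL.
SELECT A.COL1, B.COL2
FROM A
LEFT OUTER JOIN B
  ON A.ID = B.ID
WHERE NOT (A.COL1 <=> B.COL2);
```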
Solution 2: Use COALESCE for Consistent Comparison
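One way to do this is to replace NULLs with a sentinel before comparing; combined with explicit casts, this gives a single deterministic comparison path. The sentinel value -999999 here is an assumption — pick one that can never occur in your data:

```sql
-- COALESCE swaps NULL for a sentinel so both engines always compare
-- two real DECIMAL(38,6) values. The sentinel must not appear in data.
SELECT A.COL1, B.COL2
FROM A
LEFT OUTER JOIN B
  ON A.ID = B.ID
WHERE CAST(A.COL1 AS DECIMAL(38,6))
   != COALESCE(CAST(B.COL2 AS DECIMAL(38,6)), CAST(-999999 AS DECIMAL(38,6)));
```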
Solution 3: Deterministic Ordering
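If part of the discrepancy is row ordering between runs, an explicit ORDER BY on a unique key makes the output reproducible (sketch, assuming ID is unique in A):

```sql
-- Without ORDER BY, Spark guarantees no ordering, and Photon and
-- non-Photon plans may emit rows in different orders.
SELECT A.COL1, B.COL2
FROM A
LEFT OUTER JOIN B
  ON A.ID = B.ID
WHERE A.COL1 != B.COL2
ORDER BY A.ID;
```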
Solution 4: Repartition for Consistency
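As a sketch, Spark SQL's REPARTITION hint forces a fixed shuffle layout before the join; the partition count of 200 is an assumption to tune for your data volume:

```sql
-- Pin the shuffle partitioning so both engines see the same data
-- layout going into the join. You can also pass partitioning
-- columns to the hint, e.g. REPARTITION(200, ID).
SELECT /*+ REPARTITION(200) */ A.COL1, B.COL2
FROM A
LEFT OUTER JOIN B
  ON A.ID = B.ID
WHERE A.COL1 != B.COL2;
```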
LR