โ08-15-2025 03:39 AM
When defining a streaming tables using DLT (declarative pipelines), we can provide a schema which lets us define primary and foreign key constraints.
However, references to self, i.e. the defining table, are not currently allowed (you get a "table not found" error.)
Since with DLT, you're not allowed to alter tables created through the framework, there's no way to define a self-referential constraint, i.e. for nested hierarchies, for streaming tables.
โ08-15-2025 06:27 AM
Currently, Delta Live Tables (DLT) does not support defining self-referential constraints (e.g., a foreign key pointing back to the same streaming table) at creation time, and because DLT-managed tables are immutable in terms of schema evolution through ALTER TABLE, thereโs no supported way to add such constraints later. For hierarchical or parent-child relationships within the same entity, the common workaround is to enforce the relationship at the data-processing layerโeither by implementing validation logic in your transformation code or by creating an intermediate (Silver) table that performs self-joins or integrity checks before writing to the final (Gold) table. This preserves referential integrity logically, even though the constraint is not physically declared in the table metadata.
โ08-15-2025 06:50 AM
The reasons we're interested in having the foreign key relations defined are two-fold:
โ08-15-2025 06:55 AM
I see your point โ having the foreign key definition directly in the table schema would indeed serve as valuable documentation and improve the ability of AI assistants like Genie to reason about joins and relationships. Since DLT currently doesnโt allow self-referential constraints, one potential workaround to preserve those benefits is to maintain a โdata contractโ or schema definition file (YAML/JSON) that includes these logical relationships, even if they canโt be physically enforced. This file can live alongside your pipeline code, be version-controlled, and serve both as human-readable documentation and as a source for tooling/AI prompts. Another option is to create a lightweight metadata table in Unity Catalog that lists entity relationships โ including self-references โ so itโs queryable and can be leveraged by Genie or other assistants when generating SQL. While this doesnโt enforce the constraint in the storage layer, it still provides the semantic context youโre after.
โ08-15-2025 10:49 PM
Each of these workarounds give up the optimizations that are enabled by the use of key constraints.
Passionate about hosting events and connecting people? Help us grow a vibrant local communityโsign up today to get started!
Sign Up Now