08-15-2025 03:39 AM
When defining a streaming table using DLT (declarative pipelines), we can provide a schema, which lets us define primary and foreign key constraints.
However, references to self, i.e. to the table being defined, are not currently allowed (you get a "table not found" error).
Since DLT does not allow you to alter tables created through the framework, there's no way to define a self-referential constraint on a streaming table, e.g. for nested hierarchies.
08-15-2025 06:27 AM
Currently, Delta Live Tables (DLT) does not support defining self-referential constraints (e.g., a foreign key pointing back to the same streaming table) at creation time. Because DLT-managed tables cannot be altered with ALTER TABLE, there is no supported way to add such a constraint later either. For hierarchical or parent-child relationships within the same entity, the common workaround is to enforce the relationship at the data-processing layer: either implement validation logic in your transformation code, or create an intermediate (Silver) table that performs self-joins or integrity checks before writing to the final (Gold) table. This preserves referential integrity logically, even though the constraint is not physically declared in the table metadata.
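To make the validation idea concrete, here is a minimal sketch of the integrity check itself, written in plain Python so it runs without a cluster. The column names `id` and `parent_id`, the helper `find_orphans`, and the sample rows are all hypothetical. In a real pipeline the same logic would more likely be a self-join (for example a left anti join) in the Silver step, so treat this as an illustration of the rule being checked rather than as DLT code.

```python
def find_orphans(rows, id_col="id", parent_col="parent_id"):
    """Return rows whose parent key is set but does not match any row's id.

    Hypothetical helper: the column names are assumptions for illustration.
    Rows with a null parent are treated as hierarchy roots and are valid.
    """
    rows = list(rows)
    known_ids = {row[id_col] for row in rows}
    return [
        row
        for row in rows
        if row.get(parent_col) is not None and row[parent_col] not in known_ids
    ]


# Hypothetical sample data for a nested category hierarchy.
categories = [
    {"id": 1, "parent_id": None, "name": "root"},
    {"id": 2, "parent_id": 1, "name": "child"},
    {"id": 3, "parent_id": 99, "name": "dangling"},  # parent 99 does not exist
]

orphans = find_orphans(categories)
```

Rows returned by the check could be dropped, quarantined, or used to fail the update, depending on how strict the pipeline needs to be.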
08-15-2025 06:50 AM
The reasons we're interested in having the foreign key relations defined are two-fold:
08-15-2025 06:55 AM
I see your point: having the foreign key definition directly in the table schema would indeed serve as valuable documentation and improve the ability of AI assistants like Genie to reason about joins and relationships. Since DLT currently doesn't allow self-referential constraints, one potential workaround to preserve those benefits is to maintain a "data contract" or schema definition file (YAML/JSON) that includes these logical relationships, even if they can't be physically enforced. This file can live alongside your pipeline code, be version-controlled, and serve both as human-readable documentation and as a source for tooling/AI prompts. Another option is to create a lightweight metadata table in Unity Catalog that lists entity relationships, including self-references, so that it is queryable and can be leveraged by Genie or other assistants when generating SQL. While this doesn't enforce the constraint in the storage layer, it still provides the semantic context you're after.
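As a rough sketch of what such a contract file could look like and how tooling might read it, the snippet below parses a JSON contract and renders each relationship as a one-line description. The contract layout, the table name `main.sales.category`, the field names, and the helper `describe_relationships` are all made up for illustration; there is no standard format implied here.

```python
import json

# Hypothetical contract: field names and table name are assumptions.
CONTRACT_JSON = """
{
  "table": "main.sales.category",
  "primary_key": ["id"],
  "relationships": [
    {
      "columns": ["parent_id"],
      "references_table": "main.sales.category",
      "references_columns": ["id"]
    }
  ]
}
"""


def describe_relationships(contract):
    """Render each declared relationship as a short, prompt-friendly line."""
    table = contract["table"]
    lines = []
    for rel in contract.get("relationships", []):
        cols = ", ".join(rel["columns"])
        ref_cols = ", ".join(rel["references_columns"])
        ref_table = rel["references_table"]
        # Flag self-references explicitly, since those are the ones that
        # cannot be declared on the table itself.
        kind = "self-reference" if ref_table == table else "foreign key"
        lines.append(f"{table}({cols}) -> {ref_table}({ref_cols}) [{kind}, not enforced]")
    return lines


contract = json.loads(CONTRACT_JSON)
descriptions = describe_relationships(contract)
```

The same records could be written to a metadata table instead of a file; the point is only that the relationship is stored somewhere queryable.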
08-15-2025 10:49 PM
Each of these workarounds gives up the optimizations that are enabled by the use of key constraints.