Tuesday
This is a good discussion topic. In my experience, teams today use a mix of both approaches: some are metadata-driven, but most still rely on traditional Excel-based source-to-target mappings (STTMs).
A few observations:
How most teams manage STTM today
Level 1 (Most Common)
- STTM in Excel, Word, or Confluence.
- Engineers manually translate mappings into Spark SQL, dbt, Informatica, ADF, etc.
- Documentation becomes stale quickly.
- Data quality rules are implemented separately from mappings.
Level 2 (Maturing Teams)
- STTM stored in structured tables.
- Reusable ETL framework reads metadata for:
  - Source tables
  - Target tables
  - Incremental logic
  - Column mappings
  - Audit columns
- Pipeline orchestration becomes metadata-driven.
- Still, transformation logic is often manually coded.
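To make Level 2 concrete, here is a minimal sketch of a framework step that reads one mapping record and renders an incremental load statement. The `TableMapping` and `ColumnMapping` classes, the `build_load_sql` function, and the table and column names are all hypothetical, chosen only for illustration; a real framework would read these rows from its own metadata tables.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnMapping:
    source: str  # source column name or SQL expression
    target: str  # target column name


@dataclass
class TableMapping:
    # Hypothetical shape of one row in a structured STTM table.
    source_table: str
    target_table: str
    watermark_column: str  # column that drives the incremental filter
    columns: list[ColumnMapping]
    audit_columns: dict[str, str] = field(default_factory=dict)  # target name -> SQL expression


def build_load_sql(mapping: TableMapping, last_watermark: str) -> str:
    """Render an incremental INSERT ... SELECT statement from one mapping record."""
    select_items = [f"{c.source} AS {c.target}" for c in mapping.columns]
    select_items += [f"{expr} AS {name}" for name, expr in mapping.audit_columns.items()]
    target_cols = [c.target for c in mapping.columns] + list(mapping.audit_columns)
    # The watermark is inlined as a literal for readability only;
    # a production framework should bind it as a query parameter.
    return (
        f"INSERT INTO {mapping.target_table} ({', '.join(target_cols)})\n"
        f"SELECT {', '.join(select_items)}\n"
        f"FROM {mapping.source_table}\n"
        f"WHERE {mapping.watermark_column} > '{last_watermark}'"
    )


orders = TableMapping(
    source_table="raw.orders",
    target_table="silver.orders",
    watermark_column="updated_at",
    columns=[
        ColumnMapping("order_id", "order_id"),
        ColumnMapping("UPPER(status)", "order_status"),
    ],
    audit_columns={"load_ts": "CURRENT_TIMESTAMP"},
)
sql = build_load_sql(orders, "2024-01-01 00:00:00")
print(sql)
```

Note that the `UPPER(status)` expression still has to be written by hand in the metadata, which is the sense in which transformation logic stays manually coded at this level.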
Level 3 (Advanced Teams)
- Metadata repository acts as the single source of truth.
- Code generation produces:
  - SQL
  - ETL pipelines
  - DQ rules
  - Documentation
  - Lineage
- Human review before deployment.
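As a rough illustration of the Level 3 idea, the sketch below derives two artifacts, DQ checks and reviewer-facing documentation, from a single metadata record. The record layout (`not_null`, `unique`, and so on) and the function names are assumptions made up for this example, not any particular tool's format. Only DQ rules and documentation are shown.

```python
# Hypothetical metadata record; the field names are assumptions for illustration.
MAPPING = {
    "source_table": "raw.customers",
    "target_table": "silver.customers",
    "columns": [
        {"source": "cust_id", "target": "customer_id", "not_null": True, "unique": True},
        {"source": "TRIM(email)", "target": "email", "not_null": True, "unique": False},
        {"source": "signup_dt", "target": "signup_date", "not_null": False, "unique": False},
    ],
}


def generate_dq_checks(mapping: dict) -> list[str]:
    """Return SQL queries that each count rule violations (0 means the rule passes)."""
    table = mapping["target_table"]
    checks = []
    for col in mapping["columns"]:
        name = col["target"]
        if col["not_null"]:
            checks.append(f"SELECT COUNT(*) FROM {table} WHERE {name} IS NULL")
        if col["unique"]:
            checks.append(
                f"SELECT COUNT(*) FROM (SELECT {name} FROM {table} "
                f"GROUP BY {name} HAVING COUNT(*) > 1) AS dup"
            )
    return checks


def generate_docs(mapping: dict) -> str:
    """Render a plain-text mapping summary that a reviewer can read before deployment."""
    lines = [f"{mapping['source_table']} -> {mapping['target_table']}"]
    for col in mapping["columns"]:
        lines.append(f"  {col['source']} -> {col['target']}")
    return "\n".join(lines)


checks = generate_dq_checks(MAPPING)
docs = generate_docs(MAPPING)
print(docs)
for check in checks:
    print(check)
```

Because both outputs come from the same record, the documentation cannot drift away from the rules that are actually deployed, and the human review step can focus on the metadata itself.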