From STTM to Databricks Pipelines: Can Metadata Become the Source Code of Data Engineering?

AmitDECopilot · 06-14-2026

I’ve been exploring a metadata-driven approach to data engineering through a project called Data Engineering Copilot.

The idea is to treat Source-to-Target Mapping (STTM) documents as structured metadata rather than static documentation.
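To make "structured metadata" concrete, here is a minimal sketch of loading an STTM export into typed records. The CSV layout, the column names, the `{src}` placeholder convention, and the `ColumnMapping` / `load_sttm` names are all assumptions for illustration, not the actual Data Engineering Copilot format.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical STTM export. Column names and the "{src}" placeholder
# convention are assumptions made for this sketch.
STTM_CSV = """source_table,source_column,target_table,target_column,transformation,nullable
raw.customers,cust_id,silver.customer,customer_id,CAST({src} AS BIGINT),false
raw.customers,email_addr,silver.customer,email,LOWER(TRIM({src})),true
"""


@dataclass(frozen=True)
class ColumnMapping:
    """One STTM row expressed as a typed record instead of a spreadsheet cell."""
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    transformation: str
    nullable: bool


def load_sttm(text: str) -> list[ColumnMapping]:
    """Parse STTM CSV text into a list of ColumnMapping records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        ColumnMapping(
            source_table=row["source_table"].strip(),
            source_column=row["source_column"].strip(),
            target_table=row["target_table"].strip(),
            target_column=row["target_column"].strip(),
            transformation=row["transformation"].strip(),
            nullable=row["nullable"].strip().lower() == "true",
        )
        for row in reader
    ]


mappings = load_sttm(STTM_CSV)
for m in mappings:
    print(f"{m.source_table}.{m.source_column} -> {m.target_table}.{m.target_column}")
```

Once the mapping lives in records like these, it can be validated, versioned, and diffed like any other code artifact.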

Instead of manually translating STTM into Spark SQL, data quality checks, documentation, and pipelines, these artifacts could be generated automatically from a Canonical Metadata Model.

The workflow looks something like this:

STTM
↓
Canonical Metadata Model
↓
Spark SQL Generation
↓
Data Quality Rules
↓
Documentation
↓
Production Pipelines
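
As a rough sketch of the Spark SQL and data quality steps in that flow, the snippet below renders a SELECT statement and NOT NULL checks from a small in-memory model. The `Mapping` class, the `{src}` placeholder, the table names, and both generator functions are hypothetical names chosen for this example. It only builds SQL strings, and running them on Databricks is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical canonical record for one source-to-target column mapping."""
    source_column: str
    target_column: str
    expression: str = "{src}"  # "{src}" is replaced by the source column name
    nullable: bool = True


def generate_select(source_table: str, mappings: list[Mapping]) -> str:
    """Render a Spark SQL SELECT that applies each mapping's expression."""
    columns = ",\n".join(
        f"  {m.expression.format(src=m.source_column)} AS {m.target_column}"
        for m in mappings
    )
    return f"SELECT\n{columns}\nFROM {source_table}"


def generate_not_null_checks(target_table: str, mappings: list[Mapping]) -> list[str]:
    """Render one violation-count query per non-nullable target column."""
    return [
        f"SELECT COUNT(*) AS violations FROM {target_table} "
        f"WHERE {m.target_column} IS NULL"
        for m in mappings
        if not m.nullable
    ]


# Example metadata; table and column names are made up for illustration.
customer_mappings = [
    Mapping("cust_id", "customer_id", "CAST({src} AS BIGINT)", nullable=False),
    Mapping("email_addr", "email", "LOWER(TRIM({src}))"),
]

select_sql = generate_select("raw.customers", customer_mappings)
dq_checks = generate_not_null_checks("silver.customer", customer_mappings)

print(select_sql)
for check in dq_checks:
    print(check)
```

Because both outputs come from the same records, a change to the mapping updates the transformation and its checks together, which is the main appeal of generating from one model.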

I’m curious:

How are teams managing STTM today?
Are you using metadata-driven frameworks?
Has anyone experimented with generating Databricks assets directly from metadata?

Would love to hear how others are approaching this challenge.

Amit Kumar Singh
Lead Data Engineer | AI-Assisted Data Engineering
