cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Best Practices for Designing Bronze Layer with SQL Server Source in Medallion Architecture

Mohan_Baabu1
New Contributor III

Hi Databricks Experts,

I'm working on a Medallion Architecture implementation in Databricks, where the source data is coming from SQL Server. I would like some advice on how to handle the bronze layer correctly and cost-effectively.

Should I create a bronze schema in Unity Catalog and store the raw ingested tables as-is (in Delta format)?

Or is it better to simply save the raw data (as CSV or Parquet) in an ADLS container without registering Delta tables in Unity Catalog?

Additionally, from a cost perspective:

Is there any cost involved in just creating tables in Unity Catalog?

Will saving the raw data directly in ADLS (as Parquet or CSV) be more cost-efficient?

Which components incur Azure Storage vs. Databricks (DBU/compute) charges?

I'm looking to understand the proper architectural approach that balances cost and best practices for production use cases. Any suggestions or experiences would be greatly appreciated.

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions

Thanks @lingareddy_Alva for your response, considering the cost effective and the medallion architecture, i need to create bronze tables in the unity catalog than saving the files directly to the ADLS container for the bronze layer.

Also considering the features of delta tables,

- Schema Evolution - Handle SQL Server schema changes gracefully
- ACID Transactions - Consistent data during ingestion
- Time Travel - Debug data issues and recover from bad loads
- Performance - Z-ordering and file compaction
- Governance - Unity Catalog lineage and access control

View solution in original post

4 REPLIES 4

lingareddy_Alva
Honored Contributor III

Hi @Mohan_Baabu1 

Let me break down the bronze layer approach for your Medallion Architecture with detailed cost and architectural considerations.

Tables in Unity Catalog:
Why This is Better Than Raw Files:
- Schema Evolution - Handle SQL Server schema changes gracefully
- ACID Transactions - Consistent data during ingestion
- Time Travel - Debug data issues and recover from bad loads
- Performance - Z-ordering and file compaction
- Governance - Unity Catalog lineage and access control

Cost Breakdown Analysis:
FREE Components:
- Creating schemas/catalogs
- Registering tables (metadata only)
- Unity Catalog metadata storage
- Lineage tracking
- Access control policies

💰 PAID Components:
- DBU consumption during queries/operations
- Underlying storage (ADLS/S3)
- Compute for optimization operations

Detailed Cost Analysis
ADLS vs Databricks Charges:
ADLS Storage Costs:
- Standard tier: $0.0184/GB/month
- Cool tier: $0.01/GB/month (for archival bronze data)
- Transaction costs: ~$0.0004 per 10k operations

Databricks Costs:
- DBU consumption during:
* Data ingestion jobs
* Table optimization (OPTIMIZE, VACUUM)
* Query execution
* Cluster startup/shutdown

Cost Optimization Tips:
- Use spot instances: 60-70% savings
- Auto-terminate idle clusters
- Schedule optimization during off-peak hours
- Use smaller clusters for bronze layer


Final Recommendation
Use Unity Catalog Delta Tables for Bronze Layer because:

Cost-effective: Storage costs are similar to Parquet, but with better compression:
 - Production-ready: ACID transactions, schema evolution, time travel
 - Governance: Built-in lineage, access control, data discovery
 - Performance: Better query performance for silver/gold transformations
 - Maintenance: Automated optimization reduces long-term costs

The additional UC overhead is minimal compared to the operational benefits, especially in production environments where data quality and governance are crucial.

 

LR

Thanks @lingareddy_Alva for your response, considering the cost effective and the medallion architecture, i need to create bronze tables in the unity catalog than saving the files directly to the ADLS container for the bronze layer.

Also considering the features of delta tables,

- Schema Evolution - Handle SQL Server schema changes gracefully
- ACID Transactions - Consistent data during ingestion
- Time Travel - Debug data issues and recover from bad loads
- Performance - Z-ordering and file compaction
- Governance - Unity Catalog lineage and access control

pgo
New Contributor III

Create the bronze table using Auto Loader and store it in Delta format. Although it might seem like you'll only read from bronze once to populate the silver layer, in real-world production scenarios, you'll often need to re-read from bronze—for reprocessing, debugging, or new use cases. Using Delta ensures those repeated reads are performant and reliable, compared to reading directly from raw formats like JSON or CSV.

Mohan_Baabu1
New Contributor III

thanks @pgo 

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now