Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Code reusability for silver table transformations

jitesh
New Contributor

How should Databricks notebooks be organized, and how many should be created, to populate multiple silver Delta tables that each have different and complex transformations? What is the best practice:

1. Create one notebook per silver table?

2. Push the SQL transformation logic to a config table and create a single reusable notebook used by all the tables? Or some other approach?

PS: I created just a single notebook to convert the raw Parquet tables to Delta format, as the transformations were generic for all the bronze tables.

1 REPLY

Kaniz_Fatma
Community Manager

Hi @jitesh, When organizing your Databricks Notebooks for multiple silver Delta tables with different and complex transformations, it’s essential to follow best practices.

Here are some recommendations:

  1. Separate Notebooks for Each Layer:

    • Bronze Layer (Raw Data Layer): Create a separate notebook for each bronze table. Use the prefix “bronze_” followed by the source system or data source and the object’s name (e.g., bronze_salesforce_opportunities). Store data in Delta Lake format for performance, ACID transactions, and schema evolution capabilities.
    • Silver Layer (Cleansed and Enriched Data Layer): Similarly, create separate notebooks for each silver table. Use the prefix “silver_” followed by the functional area or business domain (e.g., silver_finance_transactions). Apply necessary data quality checks, type conversions, and enrichment processes.
    • Gold Layer (Aggregated and Business-Ready Data Layer): Again, create separate notebooks for each gold table. Use the prefix “gold_” followed by the functional area or business domain (e.g., gold_sales_monthly_summary). Perform aggregations and calculations as required by business requirements.
  2. Code Organization and Hierarchy:

  3. Additional Considerations:
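
The naming convention from item 1 (a layer prefix followed by the source system or business domain and the object name) can be sketched as a small helper. This is only an illustration in plain Python: the function name `table_name` and the normalization rules (lowercasing, replacing non-alphanumeric runs with underscores) are assumptions, not something prescribed by Databricks.

```python
import re

# Layers of the medallion architecture described above.
LAYERS = ("bronze", "silver", "gold")


def table_name(layer: str, *parts: str) -> str:
    """Build '<layer>_<part>_<part>...' in lowercase snake case.

    Hypothetical helper: the normalization rules are an assumption
    chosen for this sketch.
    """
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    if not parts:
        raise ValueError("at least one name part is required")
    # Lowercase each part and collapse anything that is not a-z or 0-9
    # into a single underscore.
    cleaned = [re.sub(r"[^a-z0-9]+", "_", p.lower()).strip("_") for p in parts]
    return "_".join([layer, *cleaned])


print(table_name("bronze", "Salesforce", "Opportunities"))
print(table_name("silver", "finance", "transactions"))
print(table_name("gold", "sales", "monthly summary"))
```

Generating names through one function keeps notebooks, tables, and job definitions consistent without relying on each author remembering the convention.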

Remember that the right approach may vary based on your specific use case and requirements. However, following these guidelines will help you maintain consistency, readability, and manageability in your data engineering project. 😊🚀
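
As one possible middle ground between "a notebook per table" and "SQL in a config table", the table-specific logic can live in small functions that are registered against a target table, while a single generic driver does the reading and writing. The sketch below is a minimal illustration under stated assumptions: `SilverJob`, `silver_job`, `run_all`, and the table `silver_sales_opportunities` are hypothetical names, and plain lists of dicts stand in for Spark DataFrames so the example runs anywhere. In a Databricks notebook, `read` and `write` would typically wrap `spark.read.table(...)` and `df.write.format("delta").saveAsTable(...)`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Rows = List[dict]  # stand-in for a Spark DataFrame in this sketch


@dataclass(frozen=True)
class SilverJob:
    source: str                        # bronze table to read
    target: str                        # silver table to write
    transform: Callable[[Rows], Rows]  # table-specific logic


# Hypothetical registry: target table name -> job definition.
REGISTRY: Dict[str, SilverJob] = {}


def silver_job(source: str, target: str):
    """Decorator that registers a transformation for one silver table."""
    def decorator(fn: Callable[[Rows], Rows]) -> Callable[[Rows], Rows]:
        REGISTRY[target] = SilverJob(source, target, fn)
        return fn
    return decorator


@silver_job(source="bronze_salesforce_opportunities",
            target="silver_sales_opportunities")
def clean_opportunities(rows: Rows) -> Rows:
    # Example rules (assumed): drop rows without an id, cast amount to float.
    return [{**r, "amount": float(r["amount"])}
            for r in rows if r.get("id") is not None]


def run_all(read: Callable[[str], Rows],
            write: Callable[[str, Rows], None]) -> None:
    """Generic driver: the only code shared by every silver table."""
    for job in REGISTRY.values():
        write(job.target, job.transform(read(job.source)))


# In-memory demo: a dict plays the role of the metastore.
tables = {
    "bronze_salesforce_opportunities": [
        {"id": 1, "amount": "10.5"},
        {"id": None, "amount": "3"},
    ]
}
run_all(read=tables.__getitem__, write=tables.__setitem__)
print(tables["silver_sales_opportunities"])
```

Compared with storing SQL strings in a config table, this keeps complex transformations in version-controlled, unit-testable code while still avoiding a copy of the read/write boilerplate in every notebook. Whether it fits depends on how different the transformations really are.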

 
