
Lakehouse table structure design

KuldeepChitraka
New Contributor III

We are in the process of implementing a lakehouse using Azure Databricks. We already have a data lake in place:

  • Azure Storage data lake – contains containers that hold data in its native format.

Our plan:

  • Bronze layer: build bronze tables by reading data from the data lake and storing it in those tables
    • Tables will enforce a schema
    • There will be no partitioning on the bronze tables
    • Each table will have _SourceFile and _ingestionDate columns in addition to the other columns
  • Silver layer
    • Tables will contain data from the bronze tables after transformations are applied
    • Tables will have a _loadDateTime column
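
For illustration only, the audit columns in the plan above could be stamped roughly as follows. This is a plain-Python sketch on in-memory rows, not Databricks code: the helper names `to_bronze` and `to_silver`, the sample file path, and the `amount` cast standing in for "a transformation" are all assumptions made for the example.

```python
from datetime import date, datetime, timezone


def to_bronze(rows, source_file, ingestion_date):
    """Copy raw rows and stamp each one with the bronze audit columns."""
    return [
        {**row, "_SourceFile": source_file, "_ingestionDate": ingestion_date}
        for row in rows
    ]


def to_silver(bronze_rows, load_dt):
    """Apply a placeholder transformation and stamp _loadDateTime."""
    return [
        # Hypothetical transformation: cast the raw string amount to float.
        {**row, "amount": float(row["amount"]), "_loadDateTime": load_dt}
        for row in bronze_rows
    ]


# Hypothetical raw records and source path, used only for this example.
raw = [{"id": 1, "amount": "10.5"}, {"id": 2, "amount": "7"}]
source_path = "sales/2023/01/15/part-0001.csv"

bronze = to_bronze(raw, source_path, date(2023, 1, 15))
silver = to_silver(bronze, datetime(2023, 1, 15, 6, 30, tzinfo=timezone.utc))
```

Because `_SourceFile` and `_ingestionDate` are carried through to the silver rows, each silver record can still be traced back to the file it came from.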

Which other columns should we have in the bronze and silver tables?

What should our partitioning scheme be? Can we partition the silver tables by _loadDateTime?

1 REPLY

Rishabh-Pandey
Esteemed Contributor

Hi @Kuldeep Chitrakar, leaving the bronze tables unpartitioned is fine. For the silver tables, I recommend partitioning on a date column rather than a datetime column. A datetime value changes with every load, so partitioning on it creates a new partition each time and leaves you with many small partitions. Partitioning works best on columns whose values change rarely, much like slowly changing dimensions (SCD), so if your data has such a column, partition on that. If it does not, add one more column named "date" to the silver tables and partition the files on that date instead of on _loadDateTime.
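
To make the difference concrete, here is a small plain-Python sketch that counts how many distinct partition values each choice would produce. The load schedule (one load every 15 minutes for three days) and the helper name `load_timestamps` are assumptions for the example, not details from the original question.

```python
from datetime import datetime, timedelta


def load_timestamps(start, batches, minutes_apart):
    """Return one load timestamp per batch, spaced minutes_apart minutes."""
    return [start + timedelta(minutes=minutes_apart * i) for i in range(batches)]


# Hypothetical schedule: a load every 15 minutes for 3 full days.
loads = load_timestamps(datetime(2023, 1, 1, 0, 0), batches=3 * 24 * 4, minutes_apart=15)

# Partitioning on the full datetime: every load gets its own partition value.
partitions_by_datetime = set(loads)

# Partitioning on a derived date: all loads from one day share a partition value.
partitions_by_date = {ts.date() for ts in loads}

print(len(partitions_by_datetime))  # 288
print(len(partitions_by_date))      # 3
```

With the datetime column the number of partitions grows with every load, while the derived date column adds only one partition per day.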

Rishabh Pandey
