10-21-2022 03:29 PM
Hi guys,
What are your suggestions for building a medallion architecture? How many data lake zones and which ones, how to store the data, which databases to use for storage, anything 😃
I'm thinking of these zones:
1. Landing zone: file storage in /landing_zone, Databricks database bronze with storage in /bronze_container
2. Transformed zone: file storage in /transformation_zone, Databricks database silver with storage in /silver_container
3. Insight zone: file storage in /insight_zone, Databricks database gold with storage in /gold_container
But I have a question: from the transformed zone onward the data is duplicated (it sits in both /transformation_zone and /silver_container).
What do you think? What is the best practice?
Thanks
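To make the proposed layout easier to discuss, here is a minimal Python sketch that writes the three zones down as data. The paths and database names are taken from the post; the `Zone` class and the `storage_locations` helper are hypothetical names introduced only for illustration. It simply lists both storage locations per zone, which is where the duplication concern comes from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str        # zone name as used in the post
    file_path: str   # where the zone's files are stored
    database: str    # Databricks database associated with the zone
    table_path: str  # where that database's tables are stored


# Layout as proposed in the post (illustrative only, not a recommendation).
ZONES = [
    Zone("landing", "/landing_zone", "bronze", "/bronze_container"),
    Zone("transformed", "/transformation_zone", "silver", "/silver_container"),
    Zone("insight", "/insight_zone", "gold", "/gold_container"),
]


def storage_locations(zone):
    """Return every storage location that holds this zone's data."""
    return [zone.file_path, zone.table_path]


for zone in ZONES:
    locations = " + ".join(storage_locations(zone))
    print(f"{zone.name}: {locations} (database {zone.database})")
```

Each zone ends up with two locations, so any data written to both is stored twice unless one of them only holds files that are not kept after loading.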
10-22-2022 01:53 AM
With lakes and the Hive metastore (external tables), I did it the same way.
But here is how I see it nowadays:
Do you already use Unity Catalog? Is this still a question there? With it, you are increasingly pushed to use managed tables, so you no longer care about the structure of your lake / lakehouse. It becomes more and more a DDL representation of the data, like a DWH. You create the structure in your metastore with a UC managed location (it uses IDs, not human-readable paths, to store your tables in storage).
So, in my opinion, the question is now more about how to organize your metastore (catalogs, databases, tables) to follow the medallion architecture than about how to structure your lake containers/directories.
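As a rough illustration of organizing the metastore rather than the directories, the sketch below builds the DDL for one catalog with one schema (database) per medallion layer. This is only one possible arrangement and an assumption on my part; a catalog per layer or per environment would be just as valid. The catalog name `lakehouse` and the function `medallion_ddl` are hypothetical.

```python
# Medallion layers, each mapped to one schema (database) in the catalog.
LAYERS = ("bronze", "silver", "gold")


def medallion_ddl(catalog, layers=LAYERS):
    """Build DDL statements for a catalog with one schema per layer."""
    statements = [f"CREATE CATALOG IF NOT EXISTS {catalog}"]
    for layer in layers:
        # Unity Catalog uses a three-level namespace: catalog.schema.table
        statements.append(f"CREATE SCHEMA IF NOT EXISTS {catalog}.{layer}")
    return statements


# "lakehouse" is a made-up catalog name for this example.
for statement in medallion_ddl("lakehouse"):
    print(statement)
```

In a Databricks notebook, each statement could presumably be passed to `spark.sql(...)`; managed tables created under these schemas would then be stored in the managed location without you choosing the paths.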
10-24-2022 04:17 AM
Agree, although I do not like it.
10-24-2022 11:13 AM
Hi @William Scardua,
I highly recommend using Delta Live Tables (DLT) for your use case. Please check the docs, which include sample notebooks, here: https://docs.databricks.com/workflows/delta-live-tables/index.html
10-25-2022 04:03 AM
Hi @William Scardua, we haven’t heard from you since the last response from @Jose Gonzalez, and I was checking back to see if you have a resolution yet.
If you have any solution, please share it with the community as it can be helpful to others. Otherwise, we will respond with more details and try to help.
Also, please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question.