by
alm
• New Contributor III
- 7783 Views
- 6 replies
- 2 kudos
I have a medallion architecture: Bronze layer: Raw data in tables. Silver layer: Refined data in views created from the bronze layer. Gold layer: Data products as views created from the silver layer. Currently I have a data scientist that needs access to d...
Latest Reply
Single-user clusters use a different security mode which is the reason for this difference.
On single-user/assigned clusters, you'll need the Fine Grained Access Control service (which is a Serverless service) - that is the solution to this problem (...
5 More Replies
- 1096 Views
- 1 reply
- 1 kudos
Hello, I have seen readStream and writeStream used in the gold layer in many places. Is it correct to use readStream and writeStream for the gold layer, knowing that a gold table is not valid for streaming? Is there some logic for when to use readStream/ writeStr...
Latest Reply
Hi @Ibrahim ISSOUANI Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.
by
Mado
• Valued Contributor II
- 3609 Views
- 4 replies
- 0 kudos
Assume that I have a data source that is ingested into a few bronze tables and transformed into a silver table. Next, a gold table is created by aggregating the silver table. If new records arrive in the data source, bronze and silver tables are upd...
Latest Reply
Mado
Valued Contributor II
Hi @Vidula Khanna The answer didn't fit my question. In the case of using Merge, I found a good article here: https://medium.com/@avnishjain22/simplify-optimise-and-improve-your-data-pipelines-with-incremental-etl-on-the-lakehouse-61b279afadea
3 More Replies
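As a rough illustration of the merge-based idea mentioned in this reply (not taken from the linked article), a gold aggregate can be kept up to date by aggregating only the newly arrived silver records and upserting the result into the existing totals. In the sketch below, plain Python dictionaries stand in for the silver and gold tables, and every name (`aggregate_batch`, `merge_into_gold`, `key`, `amount`) is hypothetical.

```python
from collections import defaultdict


def aggregate_batch(new_silver_rows):
    """Sum `amount` per `key` for the newly arrived silver rows only."""
    totals = defaultdict(float)
    for row in new_silver_rows:
        totals[row["key"]] += row["amount"]
    return dict(totals)


def merge_into_gold(gold, batch_totals):
    """Upsert batch totals into the gold aggregate (dict stand-in for a table)."""
    for key, amount in batch_totals.items():
        if key in gold:
            gold[key] += amount  # matched key: update the running total
        else:
            gold[key] = amount   # unmatched key: insert a new row
    return gold


# Hypothetical data: existing gold totals plus one batch of new silver rows.
gold_table = {"a": 10.0, "b": 5.0}
new_rows = [
    {"key": "a", "amount": 2.5},
    {"key": "c", "amount": 1.0},
    {"key": "c", "amount": 4.0},
]
merge_into_gold(gold_table, aggregate_batch(new_rows))
```

This only works as written for additive aggregates such as sums and counts; something like a distinct count would need a different approach.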
- 987 Views
- 0 replies
- 1 kudos
Hi all, What is the general guideline for handling flat files (XML, JSON with several nested hierarchies and an evolving schema) in the bronze layer? Should I persist the file content into a single column as text in the parquet file or should I l...
by
djfliu
• New Contributor III
- 2080 Views
- 3 replies
- 4 kudos
Hi, I'm running a structured streaming job on a pipeline with a medallion architecture. In my silver layer, we are reading from the bronze layer using structured streaming, and writing the stream to the silver layer with a foreachBatch function doing s...
Latest Reply
Hi @Danny Liu Hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!
2 More Replies
- 5759 Views
- 3 replies
- 4 kudos
Hi guys, what are your suggestions on how to create a medallion architecture? How many data lake zones and which ones, how to store the data, which databases to use for storage, anything. I think the zones could be: 1. landing zone, file storage in /landing_zone - databricks database.bro...
Latest Reply
Hi @William Scardua, I highly recommend using Delta Live Tables (DLT) for your use case. Please check the docs with sample notebooks here: https://docs.databricks.com/workflows/delta-live-tables/index.html
2 More Replies
by
Erik
• Valued Contributor III
- 3792 Views
- 1 reply
- 3 kudos
Like many of you, we have implemented a "medallion architecture" (raw/bronze/silver/gold layers), with each layer stored on a separate storage account. We only create proper hive tables of the gold layer tables, so our Power BI users connecting to the da...
Latest Reply
merca
Valued Contributor II
I can answer the first question: You can define data storage by setting the `path` parameter for tables. The "storage path" in pipeline settings will then only hold checkpoints (and some other pipeline stuff) and data will be stored in the correct acc...