MVP Articles
This page brings together externally published articles written by our MVPs. Discover expert perspectives, real-world guidance, and community contributions from leaders across the ecosystem.

Deduplicate your data

Hubert-Dudek
Databricks MVP

Declarative pipelines are among the best ways to deduplicate your data, especially for dimensions. The articles linked below cover approaches ranging from AUTO_CDC() to advanced deduplication quality checks. #databricks
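
To make the idea concrete, here is a minimal plain-Python sketch of "keep the latest record per key", which is the general pattern behind deduplicating a dimension, plus a simple duplicate-key check. It is not the declarative pipelines API and not the code from the linked articles; the names `dedupe_latest`, `duplicate_keys`, `customer_id`, and `updated_at` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Any


def dedupe_latest(rows: list[dict[str, Any]], key: str, sequence_by: str) -> list[dict[str, Any]]:
    """Keep one row per key: the row with the highest sequence value."""
    latest: dict[Any, dict[str, Any]] = {}
    for row in rows:
        current = latest.get(row[key])
        # Replace only when the incoming row is strictly newer,
        # so exact duplicates and late-arriving older rows are ignored.
        if current is None or row[sequence_by] > current[sequence_by]:
            latest[row[key]] = row
    return list(latest.values())


def duplicate_keys(rows: list[dict[str, Any]], key: str) -> dict[Any, int]:
    """Quality check: return each key that occurs more than once, with its count."""
    counts = Counter(row[key] for row in rows)
    return {k: n for k, n in counts.items() if n > 1}


# Hypothetical sample data for a customer dimension.
rows = [
    {"customer_id": 1, "city": "Warsaw", "updated_at": 1},
    {"customer_id": 2, "city": "Berlin", "updated_at": 1},
    {"customer_id": 1, "city": "Krakow", "updated_at": 3},
    {"customer_id": 1, "city": "Krakow", "updated_at": 3},  # exact duplicate
    {"customer_id": 2, "city": "Munich", "updated_at": 2},
]

deduped = dedupe_latest(rows, key="customer_id", sequence_by="updated_at")
print(deduped)
print(duplicate_keys(rows, "customer_id"))     # duplicates before
print(duplicate_keys(deduped, "customer_id"))  # duplicates after
```

In a real pipeline the platform would handle ordering and state for you; the sketch only shows the logic you would expect the result to satisfy: one row per key, and an empty result from the duplicate check.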

https://databrickster.medium.com/deduplicating-data-on-the-databricks-lakehouse-5-ways-36a80987c716

https://www.sunnydata.ai/blog/databricks-deduplication-strategies-lakehouse

[Image: dedups.png]


My blog: https://databrickster.medium.com/