Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Graham
by New Contributor III
  • 5954 Views
  • 5 replies
  • 2 kudos

"MERGE" always slower than "CREATE OR REPLACE"

Overview: To update our Data Warehouse tables, we have tried two methods: "CREATE OR REPLACE" and "MERGE". With every query we've tried, "MERGE" is slower. My question is this: Has anyone successfully gotten a "MERGE" to perform faster than a "CREATE OR...

Latest Reply
Manisha_Jena
New Contributor III
  • 2 kudos

Hi @Graham, can you please try Low Shuffle Merge (LSM) and see if it helps? LSM is a new MERGE algorithm that aims to maintain the existing data organization (including z-order clustering) for unmodified data, while simultaneously improving performan...

4 More Replies
Rishabh-Pandey
by Esteemed Contributor
  • 1028 Views
  • 1 reply
  • 3 kudos

Hey there! I've noticed that many people seem to be confused about the differences between databases, data warehouses, and data lakes. It's un...

Hey there! I've noticed that many people seem to be confused about the differences between databases, data warehouses, and data lakes. It's understandable, as these terms can be easily misunderstood or used interchangeably. Here is the summary for all ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @Rishabh Pandey, thank you for sharing your thoughts and information about Databricks. We appreciate your interest in and engagement with Databricks. Please let us know if you have any further questions or comments, as we are always happy to h...

StephanieAlba
by Valued Contributor III
  • 3951 Views
  • 2 replies
  • 3 kudos

Resolved! Best Data Model for moving from DW to Delta lake

I’m curious how Databricks recommends we model the data. Do they recommend that the data be in third normal form (3NF)? Or should it be dimensionally modeled (facts and dimensions)?

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

It all depends on the use case. 3NF is ideal for transactional systems, so for a data warehouse/lakehouse it might not be ideal. However, there certainly are cases where it is interesting. Star schemas are definitely still relevant, BUT with the processing p...

1 More Replies
User16790091296
by Contributor II
  • 817 Views
  • 1 reply
  • 0 kudos
Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 0 kudos

You have a couple of options to write data into a Data Warehouse. Some DWs have special connectors that allow for high performance between Databricks and the DW (for example, there are Spark connectors for Snowflake and for Azure Synapse DW). Some data w...

AbhishekBreeks
by New Contributor II
  • 781 Views
  • 0 replies
  • 0 kudos

Host a Star Schema Data Warehouse on Azure Databricks

Hello, is it a good idea to host a star schema data warehouse on the Azure Databricks database itself? Usually we use Azure Databricks to prep the data and then host it on Azure SQL Database. However, the question is: can we not host the data on Azure Databricks i...
