Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Graham
by New Contributor III
  • 4775 Views
  • 5 replies
  • 2 kudos

"MERGE" always slower than "CREATE OR REPLACE"

Overview: To update our Data Warehouse tables, we have tried two methods: "CREATE OR REPLACE" and "MERGE". With every query we've tried, "MERGE" is slower. My question is this: Has anyone successfully gotten a "MERGE" to perform faster than a "CREATE OR...

Latest Reply
Manisha_Jena
New Contributor III
  • 2 kudos

Hi @Graham, can you please try Low Shuffle Merge (LSM) and see if it helps? LSM is a new MERGE algorithm that aims to maintain the existing data organization (including z-order clustering) for unmodified data, while simultaneously improving performan...
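For anyone who wants to try this suggestion, here is a minimal sketch of the statements involved. It only builds the SQL text; on Databricks each string would be passed to `spark.sql(...)`. The table names, the key column, and the helper function names are hypothetical. The setting `spark.databricks.delta.merge.enableLowShuffle` is, to my knowledge, the session flag for Low Shuffle Merge, and it may already be on by default in newer runtimes, so check your runtime's documentation before relying on it.

```python
# Sketch only: builds the SQL text for the two approaches discussed in the thread.
# Table names, the key column, and these helper names are hypothetical.

# Assumed session setting for Low Shuffle Merge; verify against your runtime docs.
LOW_SHUFFLE_MERGE_CONF = "spark.databricks.delta.merge.enableLowShuffle"


def enable_low_shuffle_merge_sql() -> str:
    """Return the SET statement that turns on Low Shuffle Merge for the session."""
    return f"SET {LOW_SHUFFLE_MERGE_CONF} = true"


def merge_sql(target: str, source: str, key: str) -> str:
    """Return a MERGE that upserts source rows into target on one key column."""
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


def replace_sql(target: str, source: str) -> str:
    """Return a full rebuild of target from source, for comparison with MERGE."""
    return f"CREATE OR REPLACE TABLE {target} AS SELECT * FROM {source}"


if __name__ == "__main__":
    statements = (
        enable_low_shuffle_merge_sql(),
        merge_sql("dw.orders", "staging.orders", "order_id"),
        replace_sql("dw.orders", "staging.orders"),
    )
    for stmt in statements:
        print(stmt)
        print()
```

Timing the MERGE with and without the setting, against the CREATE OR REPLACE on the same source, should show whether LSM helps for a given table.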

4 More Replies
Rishabh-Pandey
by Honored Contributor II
  • 829 Views
  • 1 reply
  • 2 kudos

Hey there! I've noticed that many people seem to be confused about the differences between databases, data warehouses, and data lakes. It's un...

Hey there! I've noticed that many people seem to be confused about the differences between databases, data warehouses, and data lakes. It's understandable, as these terms can be easily misunderstood or used interchangeably. Here is the summary for all ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Rishabh Pandey, thank you for sharing your thoughts and information about Databricks. We appreciate your interest in and engagement with Databricks. Please let us know if you have any further questions or comments, as we are always happy to h...

StephanieRivera
by Valued Contributor II
  • 3059 Views
  • 2 replies
  • 3 kudos

Resolved! Best Data Model for moving from DW to Delta Lake

I’m curious how Databricks recommends we model the data. Do they recommend that the data be in third normal form (3NF), or should it be dimensionally modeled (facts and dimensions)?

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

It all depends on the use case. 3NF is ideal for transactional systems, so for a data warehouse/lakehouse it might not be ideal. However, there certainly are cases where it is interesting. Star schemas are definitely still relevant, but with the processing p...

1 More Replies
User16790091296
by Contributor II
  • 705 Views
  • 1 reply
  • 0 kudos
Latest Reply
Ryan_Chynoweth
Honored Contributor III
  • 0 kudos

You have a couple of options to write data into a data warehouse. Some DWs have special connectors that allow for high performance between Databricks and the DW (for example, there is a Spark connector for Snowflake and for Azure Synapse DW). Some data w...

AbhishekBreeks
by New Contributor II
  • 682 Views
  • 0 replies
  • 0 kudos

Host a Star Schema Data Warehouse on Azure Databricks

Hello, is it a good idea to host a star schema data warehouse on Azure Databricks itself? Usually we use Azure Databricks to prep the data and then host it on Azure SQL Database. However, the question is: can we not host the data on Azure Databricks i...
