Data Engineering

Forum Posts

ninjadev999
by New Contributor II
  • 4617 Views
  • 7 replies
  • 1 kudos

Resolved! Can't write big DataFrame into MSSQL server by using jdbc driver on Azure Databricks

I'm reading a huge CSV file containing 39,795,158 records and writing it into MSSQL Server on Azure Databricks. The Databricks notebook is running on a cluster node with 56 GB memory, 16 cores, and 12 workers. This is my code in Python and PySpark: from ...

Latest Reply
User16764241763
Honored Contributor
  • 1 kudos

Hi, if you are using Azure SQL DB Managed Instance, could you please file a support request with the Azure team? This is to review any timeouts or performance issues on the backend. Also, it seems like the timeout is coming from SQL Server, which is closing the conn...

6 More Replies
saipujari_spark
by Valued Contributor
  • 830 Views
  • 1 replies
  • 3 kudos

Delta Optimized Write vs Repartitioning: which is recommended?

When streaming to a Delta table, both repartitioning on the partition column and optimized write can help avoid small files. Which is recommended: Delta Optimized Write or repartitioning?

Latest Reply
saipujari_spark
Valued Contributor
  • 3 kudos

Optimized write is recommended over repartitioning for the following reasons. * The key part of Optimized Writes is that it is an adaptive shuffle. If you have a streaming ingest use case and input data rates change over time, the adaptive shuffle will a...

Kaniz
by Community Manager
  • 3936 Views
  • 1 replies
  • 0 kudos
Latest Reply
Ryan_Chynoweth
Honored Contributor III
  • 0 kudos

Repartition triggers a full shuffle and distributes the data evenly across the requested number of partitions; it can be used to either increase or decrease the partition count. Coalesce is typically used for reducing the number of partitions and does not requ...

Srikanth_Gupta_
by Valued Contributor
  • 662 Views
  • 1 replies
  • 0 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

Coalesce avoids a full shuffle and can be used to decrease the number of partitions. Repartition results in a full shuffle and can be used to increase or decrease the number of partitions.
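
To make the difference concrete, here is a toy model in plain Python (not Spark itself). It treats a dataset as a list of partitions and mimics the two behaviours described above: `coalesce` only merges whole existing partitions, while `repartition` redistributes every row. The function names mirror Spark's, but the merge rule and the round-robin assignment are simplifying assumptions made for illustration, not Spark's actual implementation.

```python
from typing import List

Partition = List[int]


def coalesce(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy coalesce: merge whole existing partitions into at most n groups.

    Rows are never split out of their original partition (no full shuffle),
    so the partition count can only go down, and skew is preserved.
    """
    if n >= len(partitions):
        # Asking for more partitions than exist leaves the count unchanged.
        return [list(p) for p in partitions]
    merged: List[Partition] = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Map neighbouring source partitions onto the same target partition.
        merged[i * n // len(partitions)].extend(part)
    return merged


def repartition(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy repartition: full shuffle, rows dealt round-robin into n partitions."""
    rows = [row for part in partitions for row in part]
    out: List[Partition] = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        out[i % n].append(row)
    return out


# A skewed dataset: 10 rows spread unevenly over 4 partitions.
skewed = [[1, 2, 3, 4, 5, 6], [7], [8], [9, 10]]

print([len(p) for p in coalesce(skewed, 2)])     # [7, 3]  (skew remains)
print([len(p) for p in repartition(skewed, 2)])  # [5, 5]  (evenly balanced)
print(len(coalesce(skewed, 8)), len(repartition(skewed, 8)))  # 4 8
```

In this sketch, coalescing is cheap because each row stays with its original partition's group, but the output can stay unbalanced; repartitioning touches every row and in exchange yields even partitions and can also increase the count.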
