Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

brickster_2018
by Esteemed Contributor
  • 6425 Views
  • 2 replies
  • 0 kudos

Resolved! How does Delta solve the large number of small file problems?

Delta creates more small files during merge and update operations.

Latest Reply
brickster_2018
Esteemed Contributor
  • 0 kudos

Delta addresses the small-file problem through the following operations available on a Delta table. Optimized writes improve the write operation by adding an extra shuffle step, which reduces the number of output files. By defau...
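As a rough illustration of the reply above, here is a minimal Python sketch that builds the SQL one might run against a Delta table on Databricks. The table name `events` and the helper `small_file_statements` are hypothetical. The reply preview is truncated, so including the auto compaction property and the `OPTIMIZE` command alongside optimized writes is an assumption about what the full reply covers, not something the preview confirms.

```python
from typing import List


def small_file_statements(table_name: str) -> List[str]:
    """Build SQL statements that aim to reduce small files for one Delta table."""
    return [
        # Table properties: shuffle before writing (optimized writes) and
        # compact small files after writes (auto compaction, an assumption here).
        f"ALTER TABLE {table_name} SET TBLPROPERTIES ("
        "'delta.autoOptimize.optimizeWrite' = 'true', "
        "'delta.autoOptimize.autoCompact' = 'true')",
        # On-demand compaction of files that already exist.
        f"OPTIMIZE {table_name}",
    ]


# "events" is a hypothetical table name used only for illustration.
statements = small_file_statements("events")
for stmt in statements:
    print(stmt)
    # On a Databricks cluster each statement could be passed to spark.sql(stmt).
```

The helper only produces strings, so it can be checked without a cluster; whether these settings suit a given workload depends on write patterns and file sizes.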

1 More Replies
kerala_tourism
by New Contributor
  • 403 Views
  • 0 replies
  • 0 kudos

Tourist attractions in Kerala are described here. Kerala has a rich tourism background, which contributes much to the economy. Tourism is the way of i...

Tourist attractions in Kerala are described here. Kerala has a rich tourism background, which contributes much to the economy. Tourism is the way of income for a large number of people in Kerala. National parks, wildlife sanctuaries, etc. are the ma...

pantelis_mare
by Contributor III
  • 3664 Views
  • 5 replies
  • 5 kudos

Resolved! Slow imports for concurrent notebooks

Hello all, I have a large number of light notebooks to run, so I am taking the concurrent approach, launching notebook runs with dbutils.notebook.run in parallel. The more I increase parallelism, the more I see the duration of each notebook increasing. I ...

Latest Reply
pantelis_mare
Contributor III
  • 5 kudos

Hello @Kaniz Fatma, yes it is clear. Following some tests on my side using a ***** notebook that does nothing but import stuff and sleep for 15 secs (so nothing to do with Spark), I figured that even with a 32-core driver, the fatigue point is clo...
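One way to probe the effect described in this thread is to cap the number of concurrent runs and time each one. The sketch below is an assumption-laden illustration, not the poster's code: `run_all`, `fake_run`, and the notebook paths are hypothetical, and `fake_run` merely sleeps so the example works outside Databricks.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_all(paths: List[str], run_one: Callable[[str], str], max_workers: int) -> Dict[str, float]:
    """Run every path with at most max_workers in flight; return seconds per path."""

    def timed(path: str) -> float:
        start = time.perf_counter()
        run_one(path)
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        durations = list(pool.map(timed, paths))  # results keep input order
    return dict(zip(paths, durations))


def fake_run(path: str) -> str:
    """Stand-in for a notebook run (hypothetical); it only sleeps briefly."""
    time.sleep(0.05)
    return "ok"


# Hypothetical notebook paths, used only for illustration.
paths = [f"/hypothetical/notebook_{i}" for i in range(8)]
for workers in (2, 8):
    durations = run_all(paths, fake_run, workers)
    print(workers, "workers, slowest run:", round(max(durations.values()), 3), "s")

# On Databricks, run_one could instead wrap the real call, for example:
#   run_one = lambda p: dbutils.notebook.run(p, 600)
```

Comparing per-notebook durations across several `max_workers` values would show where added parallelism stops paying off on a given driver.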

4 More Replies