Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

KuldeepChitraka
by New Contributor III
  • 1471 Views
  • 3 replies
  • 6 kudos

Performance Issue: Create DELTA table from 2 TB PARQUET file

We are trying to create a Delta table (CTAS statement) from a 2 TB Parquet file, and it is taking a huge amount of time, around 12 hours. Is this normal? What are the options to tune or optimize this? Are we doing anything wrong? Cluster: Interactive / 30 cores / 320 GB ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 6 kudos

Please use COPY INTO (first create an empty Delta table) or CONVERT TO DELTA instead of CTAS. It will be much faster, and the process will be auto-optimized.
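To make the two suggested routes concrete, here is a minimal sketch that only builds the SQL text for each one. The table name and source path are hypothetical placeholders (the thread does not give them), and the schemaless `CREATE TABLE IF NOT EXISTS` plus the `mergeSchema` options assume a reasonably recent Databricks Runtime. Treat it as an illustration of the statement shapes, not as a tuned solution for the 2 TB case.

```python
from typing import List


def copy_into_statements(table: str, source_dir: str) -> List[str]:
    """Build SQL for the COPY INTO route: create an empty Delta table, then load into it.

    `table` and `source_dir` are hypothetical placeholders chosen by the caller.
    """
    return [
        # Assumption: a schemaless placeholder table is allowed; the schema
        # is then inferred from the Parquet files during COPY INTO.
        f"CREATE TABLE IF NOT EXISTS {table}",
        f"COPY INTO {table}\n"
        f"FROM '{source_dir}'\n"
        "FILEFORMAT = PARQUET\n"
        "FORMAT_OPTIONS ('mergeSchema' = 'true')\n"
        "COPY_OPTIONS ('mergeSchema' = 'true')",
    ]


def convert_statement(source_dir: str) -> str:
    """Build SQL for the in-place route: convert the existing Parquet directory to Delta."""
    return f"CONVERT TO DELTA parquet.`{source_dir}`"


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for stmt in copy_into_statements("sales_delta", "/mnt/raw/sales"):
        print(stmt, end=";\n\n")
    print(convert_statement("/mnt/raw/sales") + ";")
```

In a notebook, each string could be passed to `spark.sql(...)` one at a time. Note that CONVERT TO DELTA works in place on the existing Parquet directory, while COPY INTO writes new files into the target table.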

2 More Replies
Kopal
by New Contributor II
  • 4983 Views
  • 3 replies
  • 3 kudos

Resolved! Data Engineering - CTAS - External Tables - Limitations of CTAS for external tables - can or cannot use options and location

Data Engineering - CTAS - External Tables. Can someone help me understand why, in chapter 3.3, we cannot directly use CTAS with OPTIONS and LOCATION to specify the delimiter and location of the CSV? Or have I misunderstood? Details: In Data Engineering with Databri...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

The second statement, the CTAS, will not be able to parse the CSV in any manner, because it is just the FROM clause that points to a file. It's more of a traditional SQL statement with SELECT and FROM. It will create a Delta table. This just happens to b...
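One commonly used way around this, sketched here as an assumption rather than as what the course chapter prescribes, is to put the CSV parsing options on a temporary view and then run the CTAS against that view. The snippet below only builds the SQL text. The view name, table name, paths, and delimiter are all hypothetical placeholders.

```python
from typing import List


def ctas_from_csv_statements(
    view: str, table: str, csv_path: str, location: str, delimiter: str = "|"
) -> List[str]:
    """Build a two-step workaround: a temp view carries the CSV options, CTAS reads the view.

    All arguments are hypothetical placeholders supplied by the caller.
    """
    return [
        # Step 1: the reader options (header, delimiter) live on the view,
        # because a bare CTAS has no place to put them.
        f"CREATE OR REPLACE TEMPORARY VIEW {view}\n"
        "USING CSV\n"
        f"OPTIONS (path '{csv_path}', header 'true', delimiter '{delimiter}')",
        # Step 2: the CTAS selects from the already-parsed view; LOCATION
        # makes the resulting Delta table external.
        f"CREATE TABLE {table}\n"
        f"LOCATION '{location}'\n"
        f"AS SELECT * FROM {view}",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for stmt in ctas_from_csv_statements(
        "sales_csv_vw", "sales_ext", "/mnt/raw/sales.csv", "/mnt/ext/sales"
    ):
        print(stmt, end=";\n\n")
```

Here LOCATION refers to where the new Delta table is stored, not to where the CSV lives. The CSV path is part of the view's OPTIONS.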

2 More Replies
Labels