Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Getting OOM error while loading a huge zipped CSV file into a Databricks hive_metastore table

gangs
New Contributor

Is there a better way to load a huge zipped CSV file into a hive_metastore table?

1 ACCEPTED SOLUTION

Accepted Solutions

Lakshay
Databricks Employee

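Hi @Ankit Gangwal, the problem with ZIP files is that they are not splittable, so a single core has to process the whole file. It is better to convert the data to a splittable format, for example Snappy-compressed Parquet. Snappy is splittable when it is used inside a container format such as Parquet, ORC, or Avro; a raw Snappy-compressed CSV is not. With a splittable format, Spark can distribute the workload across the cluster.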

Ref link: https://www.linkedin.com/pulse/apache-spark-optimizations-compression-deepak-rajak
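
To make the advice above concrete, here is a minimal sketch of one possible pre-processing step, assuming the archive contains UTF-8 CSV files with a header row. It streams the ZIP row by row, so the whole file is never held in memory, and writes smaller uncompressed CSV part files that Spark can read in parallel. The function name `split_zipped_csv`, the `part-00000.csv` naming, and the `rows_per_part` default are all hypothetical choices for illustration, not something from the original post.

```python
import csv
import io
import os
import zipfile


def split_zipped_csv(zip_path, out_dir, rows_per_part=1_000_000):
    """Stream every CSV member of a ZIP archive into smaller CSV part files.

    Each part file repeats the header row, so the output directory can be
    read as a single dataset. Returns the list of part-file paths written.
    (Hypothetical helper; names and defaults are illustrative assumptions.)
    """
    os.makedirs(out_dir, exist_ok=True)
    part_paths = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if not member.lower().endswith(".csv"):
                continue  # skip directories and non-CSV entries
            with archive.open(member) as raw:
                # Decode lazily; newline="" lets csv handle quoted line breaks.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.reader(text)
                header = next(reader, None)
                if header is None:
                    continue  # empty member, nothing to write
                out, writer, rows_in_part = None, None, 0
                for row in reader:
                    # Start a new part file when none is open or the current one is full.
                    if out is None or rows_in_part >= rows_per_part:
                        if out is not None:
                            out.close()
                        path = os.path.join(
                            out_dir, f"part-{len(part_paths):05d}.csv"
                        )
                        out = open(path, "w", encoding="utf-8", newline="")
                        writer = csv.writer(out)
                        writer.writerow(header)
                        part_paths.append(path)
                        rows_in_part = 0
                    writer.writerow(row)
                    rows_in_part += 1
                if out is not None:
                    out.close()
    return part_paths
```

Once the part files are in storage that the cluster can reach, the directory could be loaded with `spark.read.csv(out_dir, header=True)` and saved with `df.write.saveAsTable(...)` under a table name of your choosing. Whether this is preferable to converting the data to Parquet at the source depends on where the file comes from.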


3 REPLIES

daniel_sahal
Esteemed Contributor

@Ankit Gangwal

Scale up your cluster

Anonymous
Not applicable

Hi @Ankit Gangwal

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
