01-11-2024 07:06 PM
Reading a 130 GB file without multiLine set to true takes 6 minutes.
My file has records that span multiple lines, so I have to read it with multiLine enabled.
How can I speed up the read time in this case?
I am using the command below.
01-12-2024 05:02 AM
Hi @vishwanath_1, can you try setting the config below and check whether it resolves the issue?
set spark.databricks.sql.csv.edgeParserSplittable=true;
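In a PySpark notebook, the same setting could be applied roughly as below. This is only a sketch under assumptions: that the file is a CSV, that it has a header row, and that `spark` is the session Databricks provides. The helper names and the path are hypothetical and not taken from your command.

```python
# Sketch only: helper names, the header option, and the path are assumptions.
EDGE_PARSER_CONF = "spark.databricks.sql.csv.edgeParserSplittable"


def multiline_csv_options():
    """Reader options for a CSV whose records can span several lines."""
    # header=true is an assumption; drop it if the file has no header row.
    return {"multiLine": "true", "header": "true"}


def read_multiline_csv(spark, path):
    """Set the config from this reply, then read the CSV with multiLine on."""
    # Databricks-specific setting; it may do nothing on other Spark builds.
    spark.conf.set(EDGE_PARSER_CONF, "true")
    return spark.read.options(**multiline_csv_options()).csv(path)


# Usage in a Databricks notebook, where `spark` already exists (hypothetical path):
# df = read_multiline_csv(spark, "/mnt/data/big_file.csv")
```

Setting the config in the session before the read keeps it scoped to that notebook rather than the whole cluster.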
01-21-2024 10:18 PM
After setting spark.databricks.sql.csv.edgeParserSplittable=true;
the read now takes 30 minutes less than the usual 4 hours.
Is there any other setting that can make it faster?
01-22-2024 08:00 AM
You can also try using Photon. That can also help speed up the read operation.
09-22-2024 12:38 AM - edited 09-22-2024 12:43 AM
Hi @Lakshay, where did you find this config? Can you share a link?