Do you mean how to handle multiline records in the source CSV file? When reading with the spark.read API, did you try setting the multiLine option to true? Please try it and let us know how it goes.
Can you try the escape and quote parameters to indicate which characters need to be ignored? The escape character within the quotes will be ignored. You can also specify that the newline character needs to be ignored. Please refer to the documentation below for more info.
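For reference, here is a minimal sketch of how the three options mentioned so far (multiLine, quote, escape) could be combined in one read. The helper name `read_multiline_csv`, the `header` setting, and the choice of `"` as the escape character are assumptions for illustration; adjust them to match how the file was actually written.

```python
# Options passed to Spark's CSV reader. Values marked "assumption" depend on the file.
CSV_OPTIONS = {
    "header": "true",     # assumption: the first row holds column names
    "multiLine": "true",  # allow a quoted field to span several physical lines
    "quote": '"',         # character that wraps fields containing delimiters or newlines
    "escape": '"',        # assumption: quotes inside a field are written doubled ("")
}


def read_multiline_csv(spark, path):
    """Hypothetical helper: `spark` is an existing SparkSession, `path` a CSV location."""
    return spark.read.options(**CSV_OPTIONS).csv(path)
```

In a notebook this would be called as `df = read_multiline_csv(spark, path)`. Note that multiLine only helps when the embedded newlines sit inside quoted fields; if the source file uses a backslash to escape quotes, set escape to `\\` instead.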
In my case, none of the three options is working. I am still facing the issue: the data is not properly separated.
escape
.option("multiLine","true")
quote
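When the data is still not properly separated with all three options set, one thing worth checking is whether the embedded newlines are actually inside quoted fields. The sketch below uses Python's standard csv module on a small text sample (not Spark itself, so the parsing rules may differ slightly); `count_records` and the sample strings are hypothetical and only illustrate the check.

```python
import csv
import io


def count_records(text, delimiter=",", quotechar='"'):
    """Return (physical_lines, logical_records) for a chunk of CSV text."""
    physical = len(text.splitlines())
    # newline="" keeps embedded line breaks untouched so the csv parser sees them.
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=delimiter, quotechar=quotechar)
    logical = sum(1 for _ in reader)
    return physical, logical


# Newline inside a quoted field: the parser merges two lines into one record.
quoted = 'id,comment\n1,"first line\nsecond line"\n2,plain\n'
# Same content without quotes: every physical line becomes its own record.
unquoted = 'id,comment\n1,first line\nsecond line\n2,plain\n'

print(count_records(quoted))    # (4, 3)
print(count_records(unquoted))  # (4, 4)
```

If the two counts come out equal for a sample of the real file even though records visibly span lines, the line breaks are probably not quoted, and the quote/escape/multiLine options alone are unlikely to fix the split; the file would need to be repaired or re-exported with quoting first.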