Error truncating #REF with spark.read


Hello guys

I am trying to read an Excel file and, even with PERMISSIVE mode, it's truncating the records that contain #REF in any column 😥

Can anyone please help me with that?

from pyspark.sql.types import StructType, StructField, DateType, StringType

schema = StructType([
    StructField('Col1', DateType(), True),  # <-- THIS COLUMN HAS #REF
    StructField('Col2', DateType(), True),
    StructField('Col3', StringType(), True),
])

test = (
    spark.read.format("com.crealytics.spark.excel")
    .option("header", header_option)
    .option("parseMode", "PERMISSIVE")
    .option("keepUndefinedRows", True)
    .option("useNullForErrorCells", True)
    .option("treatEmptyValuesAsNulls", True)
    .option("setErrorCellsToFallbackValues", "true")
    .option("maxRowsInMemory", 1000)
    .option("dataAddress", "Test!A1:C200")
    .schema(schema)
    .load("File.xlsx")
)
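One workaround that may be worth trying, assuming the rows disappear because a #REF cell cannot be converted to DateType, is to declare Col1 and Col2 as StringType in the schema and convert them to dates after loading. The helper below is only a sketch of that conversion step: the name parse_excel_date is hypothetical, and the "%Y-%m-%d" format is an assumption about how the date text arrives, so it would need adjusting to the real data.

```python
from datetime import date, datetime
from typing import Optional


def parse_excel_date(raw: Optional[str], fmt: str = "%Y-%m-%d") -> Optional[date]:
    """Return a date, or None for empty cells, Excel error markers, or bad text.

    Hypothetical helper: `fmt` is an assumed format, not something the
    Excel reader guarantees.
    """
    if raw is None:
        return None
    text = raw.strip()
    # Excel error values (#REF!, #N/A, #DIV/0!, ...) all start with '#'.
    if not text or text.startswith("#"):
        return None
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        # Anything that does not match the assumed format becomes null.
        return None
```

If this fits the data, the function could be wrapped with pyspark.sql.functions.udf using DateType() as the return type and applied to the string columns, so that rows with #REF keep their other values and only the affected cell becomes null.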
