topic Re: spark excel reading custom cell in Data Engineering

spark excel reading custom cell

Jothia — Wed, 06 Aug 2025 12:13:17 GMT

Hi all,

Could someone help me read an Excel file with cells in the custom format _(* #,##0_);_(* (#,##0);_(* "-"??_);_(@_) from Databricks using the spark-excel reader?

Re: spark excel reading custom cell

Vidhi_Khaitan — Sat, 23 Aug 2025 08:37:19 GMT

Hi @Jothia

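I believe you need to replicate the display format yourself by implementing the formatting logic in Spark after reading. The custom format only controls how Excel displays the cell; the value stored in the file is still a plain number or text.

Read the file with spark-excel as usual, which will give you the raw numeric/text values: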

df = (
    spark.read.format("com.crealytics.spark.excel")
    .option("header", "true")
    .option("inferSchema", "true")
    .load("dbfs:/mnt/path/to/your/file.xlsx")
)
df.show()

If you want to mimic the Excel format, you can apply it later in Spark -

from pyspark.sql import functions as F

# Mimic the Excel sections: positive; (negative); "-" for zero
df_formatted = df.withColumn(
    "formatted_value",
    F.when(
        F.col("your_col") < 0,
        # abs() keeps the minus sign out of the parentheses
        F.concat(F.lit("("), F.format_number(F.abs(F.col("your_col")), 0), F.lit(")")),
    )
    .when(F.col("your_col") == 0, F.lit("-"))
    .otherwise(F.format_number(F.col("your_col"), 0)),
)
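If it helps to sanity-check the Spark expression, here is a rough plain-Python sketch of the same rules. The function name accounting_format is made up for illustration, and it ignores the padding and alignment parts of the Excel format (the _( and * pieces). It also assumes Excel-style half-up rounding, so compare a few rows against the Spark output rather than treating it as authoritative.

```python
from decimal import Decimal, ROUND_HALF_UP


def accounting_format(value):
    """Approximate the display of _(* #,##0_);_(* (#,##0);_(* "-"??_);_(@_).

    Hypothetical helper: padding/alignment from the Excel format is ignored.
    """
    if value is None:
        return None
    if isinstance(value, str):
        # Fourth section (@): text is shown as-is
        return value
    if value == 0:
        # Third section: zero is shown as a dash
        return "-"
    # Round to a whole number (half up) and add thousands separators
    rounded = Decimal(str(abs(value))).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    text = f"{rounded:,}"
    # Second section: negatives go in parentheses without a minus sign
    return f"({text})" if value < 0 else text
```

For example, calling accounting_format(-1234) on the driver should give the same string you expect in formatted_value for that row.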

Hope this helps!

Re: spark excel reading custom cell

BS_THE_ANALYST — Sat, 23 Aug 2025 12:54:16 GMT

@Jothia I'd be happy to jump in and take a look if the post from @Vidhi_Khaitan doesn't resolve the problem. Seems like an interesting problem! 👌

I guess, as @Vidhi_Khaitan mentions, we'll need to apply those steps after we've read the file in. There may also be alternative options: Python libraries with richer support for Excel files that can read the custom format information as well.
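As one possible example of the library route, openpyxl exposes each cell's format string through cell.number_format. The sketch below is only an illustration: the function name is made up, it assumes openpyxl is installed on the cluster, and it assumes the file is reachable as a local path (or file-like object) on the driver.

```python
from openpyxl import load_workbook


def read_cells_with_formats(path, sheet_name=None):
    """Return (coordinate, value, number_format) for every non-empty cell.

    Hypothetical helper; `path` may be a local file path or a file-like object.
    """
    # data_only=True returns cached results instead of formula strings
    wb = load_workbook(path, data_only=True)
    ws = wb[sheet_name] if sheet_name else wb.active
    cells = []
    for row in ws.iter_rows():
        for cell in row:
            if cell.value is not None:
                cells.append((cell.coordinate, cell.value, cell.number_format))
    return cells
```

The resulting tuples could then be turned into a Spark DataFrame if the format string is needed alongside the raw value.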

An .xlsx file is just a .zip archive, so there will certainly be ways to tear it apart and read the relevant parts should it come to that. There's a risk vs reward and time-spent consideration here, though. Does reading the formats really need to be automated? How often does this come up? Or can you just format the data once it's been read in, as per @Vidhi_Khaitan's answer?
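To give an idea of the "tear it apart" route, here is a minimal sketch using only the standard library. It lists the custom number formats declared in xl/styles.xml. The function name is made up, and it assumes the usual (transitional) SpreadsheetML namespace; a file saved as Strict Open XML uses a different namespace and would need NS adjusted.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumption: the workbook uses the transitional SpreadsheetML namespace
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def custom_number_formats(xlsx_path):
    """Return {numFmtId: formatCode} for custom formats in xl/styles.xml.

    Hypothetical helper; `xlsx_path` may be a path or a file-like object.
    """
    with zipfile.ZipFile(xlsx_path) as zf:
        root = ET.fromstring(zf.read("xl/styles.xml"))
    return {
        int(fmt.get("numFmtId")): fmt.get("formatCode")
        for fmt in root.findall("m:numFmts/m:numFmt", NS)
    }
```

This only lists the formats the workbook declares; mapping them back to individual cells means following the style indexes in the sheet XML as well, which is where the time spent starts to add up.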

All the best,
BS

Re: spark excel reading custom cell

SebastianRowan — Sat, 23 Aug 2025 17:31:51 GMT

Anyone managed to read excel files with weird custom number formats in Spark?

Re: spark excel reading custom cell

BS_THE_ANALYST — Sat, 23 Aug 2025 18:21:26 GMT

I don't think this should be too hard to handle with a Python library, @SebastianRowan. I'm happy to take a look if you could provide an example of where it doesn't work 👍

All the best,
BS