filipniziol
Esteemed Contributor

Hi @baert23 ,

To read an Excel file, just use pandas and convert the result to a Spark DataFrame.
That's a straightforward way to work with Excel files.

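import pandas as pd

# Reading .xlsx files with pandas requires the openpyxl package
pandas_df = pd.read_excel("path_to_your_excel_file.xlsx")
# Convert the pandas DataFrame to a Spark DataFrame
spark_df = spark.createDataFrame(pandas_df)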
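
One thing that can trip up the conversion: Excel columns that mix text and numbers come back from pandas as object columns, and Spark may fail to infer a single type for them. Below is a minimal sketch of one way to handle that, assuming it is acceptable to treat such columns as strings. The helper name prepare_for_spark and the sample data are hypothetical, made up for illustration only.

```python
import pandas as pd


def prepare_for_spark(pandas_df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose object columns hold only strings or missing values."""
    cleaned = pandas_df.copy()
    for column in cleaned.columns:
        if cleaned[column].dtype == object:
            # Leave missing cells missing; turn every other value into a string.
            cleaned[column] = cleaned[column].map(
                lambda value: None if pd.isna(value) else str(value)
            )
    return cleaned


# Hypothetical sample standing in for the output of pd.read_excel(...).
sample_df = pd.DataFrame({"id": [1, 2, 3], "code": ["A1", 42, None]})
cleaned_df = prepare_for_spark(sample_df)

# On a cluster with an active Spark session you would then run:
# spark_df = spark.createDataFrame(cleaned_df)
```

If you would rather read every column as text from the start, passing dtype=str to pd.read_excel is another option.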

If you still have any issues, share the transformations you are doing, and let's see whether we can optimize further.