Databricks Community

Ziy_41 · ‎10-21-2024

Hi,

I have attached an Excel file in Databricks, but unfortunately when I run display(df) the output shows what looks like a different language. I'm attaching a screenshot below. Please let me know what is going wrong. Thank you in advance.

Panda · ‎10-21-2024

@Ziy_41 When loading the file, try explicitly setting the encoding:

.option("encoding", "UTF-8")

Stefan-Koch · ‎10-21-2024

CSV and Excel are not the same file format, so a CSV reader cannot parse an .xlsx file.
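To illustrate why the output probably looks like a foreign language: an .xlsx file is a ZIP archive of XML parts, so a text reader ends up decoding compressed bytes as characters. Here is a minimal sketch using only the Python standard library. The helper names looks_like_xlsx and make_fake_xlsx are made up for this example, and the check for "xl/workbook.xml" assumes the usual workbook layout written by Excel and openpyxl.

```python
import io
import zipfile


def looks_like_xlsx(raw: bytes) -> bool:
    """Return True if the bytes are a ZIP archive that contains a workbook part."""
    buffer = io.BytesIO(raw)
    if not zipfile.is_zipfile(buffer):
        return False
    with zipfile.ZipFile(buffer) as archive:
        # Assumption: the workbook lives at the conventional path.
        return "xl/workbook.xml" in archive.namelist()


def make_fake_xlsx() -> bytes:
    """Build a tiny in-memory ZIP that mimics the layout of an .xlsx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/workbook.xml", "<workbook/>")
    return buffer.getvalue()


csv_bytes = b"Country,Sales\nGermany,100\n"
xlsx_bytes = make_fake_xlsx()

print(looks_like_xlsx(csv_bytes))   # False: plain text, not a ZIP archive
print(looks_like_xlsx(xlsx_bytes))  # True: ZIP archive with a workbook part
print(xlsx_bytes[:2])               # b'PK', the ZIP signature
```

If a check like this returns True for your file, changing the encoding option on a CSV reader is unlikely to help; the file needs an Excel reader instead.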

You can load the Excel data into a pandas-on-Spark DataFrame and then convert it to a PySpark DataFrame.

First, you have to install the openpyxl library:

%pip install openpyxl

Then import the pandas API on Spark:

import pyspark.pandas as ps

And then read the Excel data into a DataFrame:

path = "/Volumes/demo/raw/files/FinancialsSampleData.xlsx"

# read the named sheet into a pandas-on-Spark DataFrame
pdf_sheet1 = ps.read_excel(path, sheet_name="Financials1")

# convert the pandas-on-Spark DataFrame to a PySpark DataFrame
df_sheet1 = pdf_sheet1.to_spark()
display(df_sheet1)