How to read Excel files inside a Databricks notebook?

jeremy98
Honored Contributor

Hi community,

Is it possible to read Excel files from DBFS in a notebook inside Databricks? If so, how?

Stefan-Koch
Databricks Partner

jeremy98
Honored Contributor

Hello,

Thanks for your answer, but the point is that the file is located on DBFS, and it seems that on serverless compute the pandas API cannot access DBFS.

Stefan-Koch
Databricks Partner

In general, you shouldn't use DBFS anymore; use Volumes instead.

But, as an example, if you have an Excel file in your Workspace directory, you could do this:


# openpyxl is the engine pandas uses to read .xlsx files
%pip install openpyxl
import pandas as pd

# replace with your path
file_path = "/Workspace/Users/stefan.koch@btelligent.com/excel/FinancialsSampleData.xlsx"

# read the sheet named Financials1 into a pandas DataFrame
pdf = pd.read_excel(file_path, sheet_name="Financials1")

# convert the pandas DataFrame to a PySpark DataFrame
df = spark.createDataFrame(pdf)

display(df)


Would this work for you? If not, what is your DBFS path?
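For the Volumes route mentioned above, the same pandas approach should carry over, since Unity Catalog volumes are addressed with paths of the form /Volumes/<catalog>/<schema>/<volume>/<path>. The following is only a minimal sketch: the catalog, schema, and volume names (main, finance, raw) and both helper functions are hypothetical and not part of the original answer, and it assumes openpyxl is installed as in the snippet above.

```python
import posixpath


def volume_path(catalog, schema, volume, relative_path):
    """Build a Unity Catalog volume file path (hypothetical helper)."""
    # Volume files live under /Volumes/<catalog>/<schema>/<volume>/...
    return posixpath.join("/Volumes", catalog, schema, volume, relative_path.lstrip("/"))


def read_excel_from_volume(path, sheet_name=0):
    """Read one sheet of an Excel file into a pandas DataFrame (hypothetical helper)."""
    # Imported here so the path helper can be used without pandas installed;
    # reading .xlsx files also requires openpyxl.
    import pandas as pd

    return pd.read_excel(path, sheet_name=sheet_name)


# "main", "finance" and "raw" are placeholder names; replace with your own.
file_path = volume_path("main", "finance", "raw", "excel/FinancialsSampleData.xlsx")
print(file_path)

# In a notebook, the rest would follow the same steps as above:
# pdf = read_excel_from_volume(file_path, sheet_name="Financials1")
# df = spark.createDataFrame(pdf)
# display(df)
```

Keeping the path construction separate from the read makes it easy to switch between a Workspace path and a Volume path without touching the pandas call.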

 

 


jeremy98
Honored Contributor

Amazing, yes, that is exactly what I need! Thanks, Stefan!