
How to Read Shared Drive Data in Databricks

Akshay_Petkar
Valued Contributor

Hi everyone,

I am working on a project where the data is stored on a Shared Drive. How can I read an Excel file from the Shared Drive into a Databricks notebook?

Thanks,

Akshay Petkar
4 REPLIES

BS_THE_ANALYST
Esteemed Contributor II

@Akshay_Petkar, is the Shared Drive something like a SharePoint site's document library? It's not uncommon for businesses to sync these locally. If yes, then you can just use a SharePoint connector or build one.

If this is just a typical network drive, there's really no way to expose it directly to Databricks. However, if you leverage the Databricks SDK/CLI/API, you could move that file programmatically from the Shared Drive into your Databricks environment 🙂. Once it's in your Databricks environment, you can then read the Excel file. If your IT team don't allow this, you can always propose an SFTP route instead.
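As a rough sketch of that route with the Databricks SDK for Python: the upload step runs on a machine that can see the Shared Drive, and the read step runs in a notebook. This assumes the workspace has a Unity Catalog volume to land the file in and that `openpyxl` is available for pandas. The helper names, the catalog/schema/volume names, and the `Z:` path are all hypothetical, so treat it as a starting point rather than a definitive implementation.

```python
from pathlib import Path


def volume_path(catalog: str, schema: str, volume: str, filename: str) -> str:
    """Build the path of a file inside a Unity Catalog volume."""
    return f"/Volumes/{catalog}/{schema}/{volume}/{filename}"


def upload_to_volume(local_file: str, target: str) -> None:
    """Run where the Shared Drive is reachable: copy one file into a volume."""
    # Imported here so volume_path() works even without the SDK installed.
    from databricks.sdk import WorkspaceClient

    # Default auth: environment variables or a ~/.databrickscfg profile.
    client = WorkspaceClient()
    with Path(local_file).open("rb") as handle:
        client.files.upload(target, handle, overwrite=True)


def read_excel_from_volume(target: str):
    """Run in a Databricks notebook: load the uploaded workbook with pandas."""
    import pandas as pd  # .xlsx files need an engine such as openpyxl

    return pd.read_excel(target)


# Example usage (hypothetical names, not taken from the thread):
#   target = volume_path("main", "default", "landing", "sales.xlsx")
#   upload_to_volume(r"Z:\reports\sales.xlsx", target)   # on your machine
#   df = read_excel_from_volume(target)                  # in the notebook
```

The same upload could be done with the CLI or the REST API. The SDK is used here only because it keeps both halves in Python.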

I wrote a blog on getting data into Databricks manually/programmatically. In it, I used the CLI to move a file from my machine to my Databricks environment:
https://community.databricks.com/t5/community-articles/episode-1-getting-data-in-learning-databricks... 

And I wrote a blog on working with Excel files in Databricks: 
https://community.databricks.com/t5/community-articles/episode-2-reading-excel-files-learning-databr... 

If you don't know whether to use the CLI, SDK, or API, here's a great article that explains the differences: https://alexott.blogspot.com/2024/09/databricks-sdks-vs-cli-vs-rest-apis-vs.html

All the best,
BS

szymon_dybczak
Esteemed Contributor III

Hi @Akshay_Petkar ,

Could you provide more information? "Shared drive" is a pretty broad term. It could be a Windows SMB/CIFS share, AWS FSx, Google Shared Drive, etc.

Hi @szymon_dybczak ,

It is a Windows SMB / CIFS share.

Thanks,

Akshay Petkar

szymon_dybczak
Esteemed Contributor III

Hi @Akshay_Petkar ,

File shares are not natively supported in Databricks. Moreover, you need to ensure that there is network connectivity between your cluster and the server where your share resides. If your share is on-premises, you need to configure all the networking first (VNet injection of your workspace, configuring a gateway, etc.).
Then you can write a Python notebook to read the data.

So, to put it simply: you can use Python code to read from a share.
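A minimal sketch of what such a notebook could look like, assuming the cluster can already reach the file server over SMB (typically TCP port 445) and that the third-party `smbprotocol` package (which provides the `smbclient` module), `pandas`, and `openpyxl` are installed on the cluster. The thread does not name a library, so `smbprotocol` is an assumption here, and the helper names, server, share, and file names are hypothetical.

```python
import io


def unc_path(server: str, share: str, *parts: str) -> str:
    r"""Build a UNC path such as \\server\share\folder\file.xlsx."""
    return "\\\\" + "\\".join([server, share, *parts])


def read_excel_from_smb(server, share, relative_parts, username, password):
    """Read one Excel workbook from an SMB/CIFS share into a pandas DataFrame."""
    # Imported here so unc_path() works even without these packages installed.
    import pandas as pd
    import smbclient  # provided by the smbprotocol package

    # Authenticate once for this server; later calls reuse the session.
    smbclient.register_session(server, username=username, password=password)

    remote_file = unc_path(server, share, *relative_parts)
    with smbclient.open_file(remote_file, mode="rb") as remote:
        payload = remote.read()

    # pandas reads the workbook from memory; .xlsx needs openpyxl.
    return pd.read_excel(io.BytesIO(payload))


# Example usage in a notebook (hypothetical names, not taken from the thread):
#   df = read_excel_from_smb(
#       "fileserver01", "finance", ["reports", "q1.xlsx"],
#       username="DOMAIN\\svc_databricks",
#       password=dbutils.secrets.get("my-scope", "smb-password"),
#   )
```

The file is read fully into memory before parsing, which is fine for typical workbooks. For very large files it may be better to copy the file to cloud storage first.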
