Connect to data in one drive to Azure Databricks

chari
Contributor

Hello,

A colleague of mine previously built a data pipeline, written in Python in a Jupyter notebook, that connects to data available on SharePoint (OneDrive). Now it's my job to move the code to Azure Databricks, and I am unable to connect to or download this data on the new platform.

Some have hinted that it's impossible, so I am seeking help here to find a workaround. If you have any experience with this, please take a moment to reply. It helps!

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions

gabsylvain
New Contributor III

Hey @chari ,

In my experience, custom Python code that interacts with third-party applications through REST APIs and was built in Jupyter should also work within Databricks. There's even an existing Python client library for this: https://pypi.org/project/Office365-REST-Python-Client/
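As a rough illustration of the REST approach, here is a minimal sketch that pulls a SharePoint/OneDrive file down to the Databricks driver using the Microsoft Graph API with `requests` (an alternative to the client library above). The site ID, file path, and access token are all placeholders: in practice you would obtain a token for an Azure AD app registration with the appropriate Graph permissions.

```python
import requests  # available on Databricks runtimes

GRAPH = "https://graph.microsoft.com/v1.0"

def drive_item_content_url(site_id: str, file_path: str) -> str:
    """Build the Graph endpoint that returns a file's raw bytes.
    site_id and file_path are placeholders for your tenant's values."""
    return f"{GRAPH}/sites/{site_id}/drive/root:/{file_path.lstrip('/')}:/content"

def download_file(token: str, site_id: str, file_path: str, local_path: str) -> None:
    """Download one SharePoint/OneDrive file to driver-local storage,
    e.g. /tmp, from where it can be read or copied into DBFS."""
    resp = requests.get(
        drive_item_content_url(site_id, file_path),
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    with open(local_path, "wb") as f:
        f.write(resp.content)

# Hypothetical usage (token acquisition not shown):
# download_file(token, site_id, "/Shared Documents/file.xlsx", "/tmp/file.xlsx")
```

The Office365-REST-Python-Client library wraps the same underlying endpoints, so whichever route your colleague's notebook used should carry over.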

Otherwise, CData offers a JDBC connector for SharePoint, but it is licensed, so you might have to pay to use it: https://www.cdata.com/kb/tech/sharepoint-jdbc-azure-databricks.rst

It would help if you could share a bit more information on how your colleague originally built the data pipeline, along with any error messages / stack traces from when it fails on Azure Databricks.

Thanks,

Gab


4 REPLIES 4


Hello Gab,

 

DATA = os.getenv("name", f"C:/Users/{os.getlogin()}/Group/Project/file.xlsx")
 
is how the code was originally written. "name" is the name of the environment variable.

Hello Gab,

Using the REST API is a good idea; however, that's new to me.

Thanks

gabsylvain
New Contributor III
New Contributor III

@chari You can also ingest both SharePoint and OneDrive data directly into Databricks using Partner Connect.

You can refer to the documentation below for more information:

 
