03-27-2025 01:12 PM
Hello. We are using Azure Databricks and would like to ingest data from a specific M365 SharePoint Online site/list. I originally tried to follow this recommendation: https://learn.microsoft.com/en-us/answers/questions/2116616/service-principal-access-to-sharepoint-o... However, I get errors on the last step when using a Service Principal, which was the original recommendation. I am reaching out here because I'm looking for a better way to pull the data from SharePoint into Databricks.
Has anyone done this successfully? What method did you use?
Any assistance is appreciated.
Thanks in advance.
04-02-2025 12:58 AM
I don't know if you use Data Factory, but it has a SharePoint connector which we use to fetch online lists.
Use your data lake as a sink and unleash Databricks on it.
2 weeks ago
Hi there, I'm really new to ADF and I'm trying to work with the SharePoint Lists connector. However, I'm having a problem with some 'User Information List' columns. Did you have this problem too? Or do you know how to 'expand' this kind of column?
Thanks in advance for your answer.
04-02-2025 06:40 AM
Hello, thanks for your response. Do you have a reference? We have used ADF in the past, but we are a bit rusty in that realm. Thanks.
04-02-2025 06:51 AM
https://learn.microsoft.com/en-us/azure/data-factory/connector-sharepoint-online-list?tabs=data-fact...
It is no big deal tbh.
But if you do not use ADF at the moment, it might be overkill.
04-02-2025 11:50 AM
Maybe, but at this point, we just need a method to reliably get data from the specific SharePoint Online site into Databricks. So we are open to the easiest, most efficient method. Thanks.
2 weeks ago
A reliable and efficient approach is to use Azure Data Factory (ADF) with the SharePoint Online List connector to extract data from your M365 SharePoint List and write it to Azure Data Lake Storage, then use Azure Databricks to process the data from there. This gives you a low-code, scalable pipeline. Note that the connector itself authenticates with a service principal, so the app registration still needs access to the site. If you're seeing issues with expanding complex columns (like 'User Information List'), use ADF's Data Flow or post-processing in Databricks to flatten those structures. Here's a quick start guide: ADF SharePoint List Connector.
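To illustrate the "flatten in post-processing" idea, here is a minimal Python sketch. It assumes the person column arrives as a nested object; the field names (AssignedTo, Id, Title, EMail) and the sample record are hypothetical, and the actual shape written by the connector may differ, so inspect a few landed records first.

```python
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], sep: str = "_") -> Dict[str, Any]:
    """Flatten nested dicts into one level, joining key names with `sep`.

    Non-dict values (including lists) are copied through unchanged.
    """
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # Recurse so deeper nesting is flattened too.
            for sub_key, sub_value in flatten_record(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


# Hypothetical list item with a nested person column (shape is an assumption).
sample_item = {
    "Title": "Quarterly report",
    "AssignedTo": {"Id": 12, "Title": "Jane Doe", "EMail": "jane.doe@example.com"},
}

flat_item = flatten_record(sample_item)
print(flat_item)
```

In a Databricks notebook the same idea would usually be applied to a DataFrame by selecting the nested fields into top-level columns, but the plain-Python version shows the logic without needing a cluster.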
2 weeks ago
We achieved the same using the SharePoint API. You can follow the steps outlined in this documentation: https://learn.microsoft.com/en-us/graph/auth-v2-service?tabs=http.
Additionally, you can grant the Sites.Selected permission to the Azure AD application you're using for API calls.
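For anyone who wants to try this route, here is a rough sketch of the flow using only the Python standard library. It assumes the client credentials flow described in the linked documentation and the Microsoft Graph list items endpoint. The function names are made up for illustration, and the tenant, client, site and list identifiers are placeholders you would supply yourself. With Sites.Selected, an admin also has to grant the app access to the specific site before these calls will succeed.

```python
import json
import urllib.parse
import urllib.request

GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the client-credentials token request (nothing is sent here)."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://graph.microsoft.com/.default",
        "grant_type": "client_credentials",
    }).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def get_token(tenant_id: str, client_id: str, client_secret: str) -> str:
    """Send the token request and return the access token."""
    request = build_token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]


def list_items_url(site_id: str, list_id: str) -> str:
    """URL for the items of one list, with column values expanded."""
    return f"{GRAPH_ROOT}/sites/{site_id}/lists/{list_id}/items?expand=fields"


def fetch_all_items(token: str, site_id: str, list_id: str) -> list:
    """Follow @odata.nextLink until every page of list items is collected."""
    items = []
    url = list_items_url(site_id, list_id)
    while url:
        request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(request) as response:
            page = json.load(response)
        items.extend(page.get("value", []))
        url = page.get("@odata.nextLink")  # absent on the last page
    return items
```

In Databricks you would typically read the client secret from a secret scope rather than hard-coding it, then call get_token followed by fetch_all_items and write the result to your lake or a table.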