File-based triggers in Databricks are designed to work with data that already resides in cloud storage (such as ADLS, S3, or GCS). Since the source system here is SharePoint, a native file-arrival trigger from Databricks is not feasible.
SharePoint does not natively emit events that Databricks can directly subscribe to for real-time file ingestion. Because of this, you cannot implement a true event-driven (file-drop) trigger directly between SharePoint and Databricks.
Also note:
A native SharePoint connector for Databricks is now available via Lakeflow Connect, but it is currently in beta. Because of its limited maturity, it may not yet fully support production-grade event-driven or trigger-based ingestion scenarios.
Alternative approaches:
- Scheduled ingestion (recommended baseline)
You can implement a scheduled Databricks job in Python that periodically checks SharePoint (via the Microsoft Graph API or the SharePoint REST API), downloads new or updated Excel files, and copies them into external volumes. This is the most reliable and widely used approach.
- Third-party connectors such as Fivetran
Databricks has partnered with Fivetran, which provides a managed connector for SharePoint that:
  - Automatically detects updates and ingests data incrementally
  - Supports near real-time/streaming-style ingestion
  - Drawback: the pipeline typically needs to remain running, which may increase cost
  - Link: https://www.fivetran.com/connectors/sharepoint
- Event-driven workaround (advanced option)
If near real-time behavior is required, you can use Microsoft tools like Power Automate or Azure Logic Apps to detect file uploads in SharePoint and trigger downstream processes (e.g., call an API or trigger a Databricks job). This introduces additional components but enables near event-driven behavior.
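For the scheduled-ingestion baseline, the polling step against the Microsoft Graph API might look like the sketch below. This is not a complete implementation: the site ID, access token, and destination volume path are placeholders you would supply, and listing the drive root's children is a simplifying assumption (a real library may sit in a subfolder and require paging).

```python
# Hedged sketch: poll a SharePoint drive via Microsoft Graph and copy
# new/updated Excel files into a Databricks volume path.
# site_id, token, and dest are assumptions supplied by the caller.
import json
import urllib.request
from datetime import datetime

GRAPH = "https://graph.microsoft.com/v1.0"

def is_new_or_updated(item: dict, last_run: datetime) -> bool:
    """True if a Graph driveItem is an .xlsx modified after the previous run."""
    modified = datetime.fromisoformat(
        item["lastModifiedDateTime"].replace("Z", "+00:00")
    )
    return item["name"].lower().endswith(".xlsx") and modified > last_run

def list_drive_children(site_id: str, token: str) -> list:
    """List items in the site's default drive root (no paging, for brevity)."""
    req = urllib.request.Request(
        f"{GRAPH}/sites/{site_id}/drive/root/children",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("value", [])

def sync_to_volume(site_id: str, token: str, last_run: datetime, dest: str):
    """Download each new/updated file into dest (e.g. a /Volumes/... path)."""
    for item in list_drive_children(site_id, token):
        if is_new_or_updated(item, last_run):
            # Graph returns a short-lived pre-authenticated download link
            with urllib.request.urlopen(item["@microsoft.graph.downloadUrl"]) as r:
                with open(f"{dest}/{item['name']}", "wb") as f:
                    f.write(r.read())
```

You would persist the last successful run timestamp (for example in a Delta table) so each scheduled run only picks up files modified since the previous one.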
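For the event-driven workaround, the final hop (Power Automate or Logic Apps calling Databricks) is an HTTP POST to the Jobs API `run-now` endpoint. A minimal sketch of that call, assuming a workspace URL, a PAT or service-principal token, a pre-created job ID, and a hypothetical `source_file_url` job parameter:

```python
# Hedged sketch: trigger a Databricks job from an external automation tool.
# host, token, job_id, and the "source_file_url" parameter name are assumptions.
import json
import urllib.request

def build_run_now_payload(job_id: int, file_url: str) -> dict:
    """Body for the Jobs 2.1 run-now endpoint, passing the new file's URL."""
    return {"job_id": job_id, "job_parameters": {"source_file_url": file_url}}

def trigger_databricks_job(host: str, token: str, job_id: int, file_url: str) -> int:
    """POST /api/2.1/jobs/run-now and return the launched run's ID."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/run-now",
        data=json.dumps(build_run_now_payload(job_id, file_url)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["run_id"]
```

In practice Power Automate or Logic Apps would issue this same POST directly from an HTTP action when its SharePoint "file created/modified" trigger fires, so no intermediate script is strictly required.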
Rohan