Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

DLT x UC x Auto Loader

mplang
New Contributor

Now that the Directory Listing Mode of Auto Loader is officially deprecated, is there a solution for using File Notification Mode in a DLT pipeline writing to a UC-managed table? My understanding is that File Notification Mode is only available on single-user mode clusters, but those aren't available in the DLT+UC space. Am I missing something?

3 REPLIES

drewipson
New Contributor III

I am having the same concern and am reaching out to our Solutions Architect to better understand how Auto Loader and DLT can be used together. DLT and Auto Loader should go hand in hand, especially when using file notification mode.

stbjelcevic
Databricks Employee

Yes: use Auto Loader’s UC-managed file events in DLT on DBR 14.3 LTS or later. Enable file events on the UC external location and set the managed file events option on the stream. This avoids the single-user compute limitation of legacy file notifications and works with UC-managed tables.

  • File events for Auto Loader on UC are supported on DBR 14.3+ and recommended over directory listing for performance/cost.

  • The “single-user only” limitation applies to legacy per-stream file notifications, not to UC-managed file events.

  • Run your streams at least once every seven days to avoid falling back to a full directory listing; changing source paths isn’t supported in file notification mode.

  • If you can’t enable file events, directory listing mode still works (incremental listing is deprecated, so leave it off).
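As a rough illustration of the distinction in the bullets above, here are the two notification-style Auto Loader configurations side by side. The option names are taken from Databricks documentation (`cloudFiles.useNotifications` for the legacy mode, `cloudFiles.useManagedFileEvents` for UC-managed file events); treat the exact spellings as assumptions to verify against your runtime’s docs.

```python
# Two ways to get event-driven file discovery with Auto Loader.
# These are plain option dictionaries; on Databricks they would be passed
# to spark.readStream.format("cloudFiles").options(**...).

# Legacy per-stream file notifications: Auto Loader provisions its own
# queue/subscription, which is why it needs single-user compute.
LEGACY_NOTIFICATION_OPTIONS = {
    "cloudFiles.format": "json",
    "cloudFiles.useNotifications": "true",
}

# UC-managed file events: the notification infrastructure is provisioned
# once per external location by Unity Catalog, so the stream itself has
# no cloud-provisioning requirement (DBR 14.3 LTS+).
MANAGED_FILE_EVENTS_OPTIONS = {
    "cloudFiles.format": "json",
    "cloudFiles.useManagedFileEvents": "true",
}
```

The practical difference is where the cloud resources are created: per stream (legacy) versus per external location (managed), which is what lifts the single-user compute restriction.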

Raman_Unifeye
Contributor III

Databricks introduced Managed File Events, which removes the need for the cluster's identity to provision cloud resources, resolving the conflict with shared access mode clusters.

Steps to Implement in DLT

  1. Enable File Events on the External Location:

    • A Workspace Admin or privileged user must go into the Unity Catalog interface.

    • Find the External Location that points to your source storage bucket/container.

    • Edit the External Location and Enable File Events. Databricks will automatically provision the necessary notification service (e.g., SQS/SNS, Azure Queue/Event Grid) using the Storage Credential linked to the External Location.

  2. Configure Auto Loader in the DLT Code: In your DLT pipeline code (Python or SQL), set the Auto Loader option that opts in to managed file events.
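A minimal sketch of step 2 in Python. The option name `cloudFiles.useManagedFileEvents`, the table name, and the storage path are assumptions for illustration; check them against current Databricks documentation and your own external location.

```python
# Sketch: DLT streaming table fed by Auto Loader with UC-managed file events.
# Assumes DBR 14.3 LTS+ and that file events were enabled on the UC
# external location in step 1.

def managed_file_events_options(source_format: str) -> dict:
    """Auto Loader (cloudFiles) options opting in to managed file events."""
    return {
        "cloudFiles.format": source_format,
        "cloudFiles.useManagedFileEvents": "true",
    }

# Inside a Databricks DLT pipeline this would be used as (Databricks-only,
# so kept as comments here):
#
# import dlt
#
# @dlt.table(name="bronze_events")  # hypothetical table name
# def bronze_events():
#     return (
#         spark.readStream.format("cloudFiles")
#         .options(**managed_file_events_options("json"))
#         .load("abfss://container@account.dfs.core.windows.net/landing")  # hypothetical path
#     )
```

Because the external location already has file events enabled, no extra queue or subscription settings are needed on the stream itself.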


RG #Driving Business Outcomes with Data Intelligence