
File trigger using Azure file share in Unity Catalog

angel_ba
New Contributor II

Hello,

 

I have Unity Catalog enabled in my workspace. Files are manually copied by customers into an Azure file share (domain-joined account, wasb) on an ad hoc basis. I would like to add a file trigger to the job so that, as soon as a file arrives in the file share, it gets copied to my container (abfss) and the job runs. However, I am only able to mount abfss, not the Azure file share. How can I resolve this?

1 REPLY

Kaniz
Community Manager

Hi @angel_ba

  1. Unity Catalog and Azure Data Lake Storage Gen2 (ADLS Gen2):

    • Unity Catalog governs access to ADLS Gen2 through storage credentials, external locations, and volumes, so notebooks and jobs can work with files directly without managing account keys themselves.
    • I recommend using Unity Catalog to set up access to your ADLS Gen2 account. The official documentation on connecting to cloud object storage with Unity Catalog has detailed instructions, and a short read sketch follows this list.
  2. ABFSS and the Azure File Share:

    • Azure Blob File System (abfss://): this is the recommended driver for ADLS Gen2 and supersedes the legacy Windows Azure Storage Blob driver (wasb://). You can use it to address ADLS Gen2 paths directly or, as a legacy pattern, to mount them (see the mount sketch after this list).
    • Azure file share: an Azure file share is an SMB/REST share, not blob storage, so it cannot be mounted in Databricks with the abfss or wasb drivers. If you need to pull files from one, see the steps below.
  3. Reaching the Azure File Share:

    • To get the share's connection details from the Azure portal:
      1. Sign in to the Azure portal.
      2. Navigate to the storage account that contains the file share you'd like to mount.
      3. Select "File shares" and choose the specific file share.
      4. Click "Connect" and select the drive letter to mount the share to.
      5. Copy the provided script.
    • Note that this script is PowerShell (or bash) intended for mounting the share over SMB on a VM or workstation; it will not run as-is in a Databricks notebook. From Databricks, a more practical route is to read the share with the azure-storage-file-share Python SDK and stage the files into your abfss container (see the sketch after this list).
  4. Triggering Jobs on File Arrival:

    • Once the files are staged in ADLS Gen2 under a Unity Catalog external location or volume, you can add a file arrival trigger so the job starts automatically when a new file lands.
    • Databricks file arrival triggers watch a cloud object storage path, not the Azure file share itself, which is why copying into your abfss container first matters.
    • You can configure the trigger in the job UI (Add trigger → File arrival) or programmatically through the Jobs API (see the sketch after this list).
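To make item 1 concrete, here is a minimal read sketch, assuming a Unity Catalog external location (or volume) already covers the container; every name below is a placeholder:

```python
# Minimal sketch: reading through Unity Catalog governed paths.
# Assumes an external location / volume already exists; `spark` and
# `dbutils` are predefined in Databricks notebooks. Names are placeholders.
df = (spark.read
      .format("csv")
      .option("header", "true")
      .load("abfss://<container>@<account>.dfs.core.windows.net/landing/"))

# Equivalent listing through a UC volume path:
files = dbutils.fs.ls("/Volumes/<catalog>/<schema>/<volume>/landing/")
```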
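For item 2, a hedged mount sketch using a service principal (mounts are a legacy pattern; Unity Catalog external locations are preferred, and every value below is a placeholder):

```python
# Legacy DBFS mount of an ADLS Gen2 container over abfss with OAuth.
# The service principal, secret scope, and paths are all placeholders.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="<scope>", key="<secret-key>"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://<container>@<account>.dfs.core.windows.net/",
    mount_point="/mnt/landing",
    extra_configs=configs,
)
```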
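For item 3, since the portal script targets SMB mounts on a machine you control, here is one way (a sketch, not the only approach) to pull a file from the share with the azure-storage-file-share SDK and stage it into abfss; the file name, key, and paths are hypothetical:

```python
# Sketch: download one file from an Azure file share, then copy it to abfss.
# Requires `pip install azure-storage-file-share`; names are placeholders.
from azure.storage.fileshare import ShareFileClient

file_client = ShareFileClient(
    account_url="https://<account>.file.core.windows.net",
    share_name="<file-share>",
    file_path="incoming/customer_file.csv",   # hypothetical file
    credential="<storage-account-key>",
)

# Stream the file to the driver's local disk first.
with open("/tmp/customer_file.csv", "wb") as f:
    file_client.download_file().readinto(f)

# Then stage it into the ADLS Gen2 container that the job watches.
dbutils.fs.cp(
    "file:/tmp/customer_file.csv",
    "abfss://<container>@<account>.dfs.core.windows.net/landing/customer_file.csv",
)
```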
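And for item 4, a sketch of attaching a file arrival trigger through the Jobs 2.1 REST API; the workspace URL, token, job ID, and watched path are all placeholders:

```python
# Sketch: add a file arrival trigger to an existing job via the Jobs API.
# Workspace URL, token, job_id, and the watched URL are placeholders.
import requests

resp = requests.post(
    "https://<workspace-url>/api/2.1/jobs/update",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={
        "job_id": 123,  # hypothetical job ID
        "new_settings": {
            "trigger": {
                "pause_status": "UNPAUSED",
                "file_arrival": {
                    # Must be a UC external location or volume path.
                    "url": "abfss://<container>@<account>.dfs.core.windows.net/landing/",
                    "min_time_between_triggers_seconds": 60,
                },
            },
        },
    },
)
resp.raise_for_status()
```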

Remember to choose the approach that best aligns with your requirements. If possible, I recommend using ABFSS and Unity Catalog for seamless integration with ADLS Gen2. If you still need to work with Azure File Shares, follow the steps outlined above. Happy data processing! 🚀🔍📂

 