Error Handling for Web Data Retrieval and Storage in Databricks Unity Catalog Clusters

nidhin
New Contributor

The following code works well on a normal Databricks cluster: it posts an empty JSON body to the web link and writes the returned content to a file.

However, on a Unity Catalog cluster, it fails with the following error: FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/raw/processing/facts/acc/dataacc.json'

import requests

data1 = requests.post('https://weblink', json={})

with open('/dbfs/mnt/raw/processing/facts/acc/dataacc.json', mode='wb') as dataFile:
    dataFile.write(data1.content)

 

Why does the write fail when using a Unity Catalog cluster?

 

 

Ayushi_Suthar
Databricks Employee

Hi @nidhin, good day!

You are getting the error below when accessing the file on the external DBFS mount with "with open" because you are using a shared access mode cluster.

FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/raw/processing/facts/acc/dataacc.json'

This is a known limitation of shared clusters, where the /dbfs path is not accessible. You can try using a single-user cluster instead: it supports Unity Catalog and can access /dbfs.
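To make this failure easier to recognize, here is a minimal sketch that checks whether the target directory is visible from the cluster before writing. The helper name write_bytes_checked and its error message are hypothetical, not part of any Databricks API; it relies only on the Python standard library.

```python
import os


def write_bytes_checked(path: str, content: bytes) -> int:
    """Write content to path and return the number of bytes written.

    Hypothetical helper: raises FileNotFoundError with an explicit message
    when the parent directory is not visible from the current cluster.
    """
    parent = os.path.dirname(path) or "."
    if not os.path.isdir(parent):
        # On a shared access mode cluster, /dbfs paths are expected to land here.
        raise FileNotFoundError(
            f"Parent directory is not visible from this cluster: {parent}"
        )
    with open(path, mode="wb") as data_file:
        return data_file.write(content)
```

In the original snippet, this would be called as write_bytes_checked('/dbfs/mnt/raw/processing/facts/acc/dataacc.json', data1.content). On a shared cluster it should still fail, but with a message that points at the inaccessible directory rather than at the file.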

Please refer to:
https://docs.databricks.com/clusters/configure.html#shared-access-mode-limitations
https://docs.databricks.com/en/dbfs/unity-catalog.html#how-does-dbfs-work-in-shared-access-mode

We also have a preview feature, 'Improved Shared Clusters', that addresses some of the limitations of shared clusters.

Please let me know if this helps, and leave a like if this information is useful. Follow-ups are appreciated.
Kudos
Ayushi