Error Handling for Web Data Retrieval and Storage in Databricks UNITY Clusters
12-11-2023 03:49 AM
The following code works well on a normal Databricks cluster, where it posts an empty JSON body and retrieves content from the web link.
However, on a Unity Catalog cluster, it fails with the following error: FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/raw/processing/facts/acc/dataacc.json'
import requests
# POST an empty JSON body and save the raw response bytes to the mount
data1 = requests.post('https://weblink', json={})
with open('/dbfs/mnt/raw/processing/facts/acc/dataacc.json', mode='wb') as dataFile:
    dataFile.write(data1.content)
Why does the write fail when using a Unity cluster?
02-15-2024 10:22 PM
Hi @nidhin, good day!
You get the error below when accessing the external DBFS mount file with "with open" because you are using a shared access mode cluster.
FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/raw/processing/facts/acc/dataacc.json'
This is a known limitation of shared clusters, where the /dbfs path is not accessible. You can try using a single-user cluster instead, which supports UC and can access /dbfs.
Please refer:
https://docs.databricks.com/clusters/configure.html#shared-access-mode-limitations
https://docs.databricks.com/en/dbfs/unity-catalog.html#how-does-dbfs-work-in-shared-access-mode
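If switching clusters is not an option, one possible workaround is to avoid the /dbfs path and write through dbutils instead. The sketch below is only an illustration under assumptions: the helper names `to_dbfs_uri` and `save_text` are hypothetical, it assumes the mount is reachable through `dbutils.fs` from your cluster, and it assumes the response is text (JSON), since `dbutils.fs.put` takes a string rather than the bytes that `data1.content` returns.

```python
def to_dbfs_uri(fuse_path: str) -> str:
    """Convert a '/dbfs/...' path to a 'dbfs:/...' URI; other paths pass through."""
    prefix = "/dbfs/"
    if fuse_path.startswith(prefix):
        return "dbfs:/" + fuse_path[len(prefix):]
    return fuse_path


def save_text(text: str, fuse_path: str, put) -> str:
    """Write text via an injected writer (e.g. dbutils.fs.put) instead of open()."""
    target = to_dbfs_uri(fuse_path)
    put(target, text, True)  # third argument: overwrite an existing file
    return target


# In a notebook (assumption: dbutils is available and can reach the mount):
# import requests
# data1 = requests.post('https://weblink', json={})
# save_text(data1.text, '/dbfs/mnt/raw/processing/facts/acc/dataacc.json', dbutils.fs.put)
```

Passing the writer in as an argument keeps the helper testable outside Databricks; in a notebook you would pass `dbutils.fs.put` directly.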
There is also a preview feature, 'Improved Shared Clusters', that addresses some of the limitations of shared clusters.
Please let me know if this helps, and leave a like if this information is useful. Follow-ups are appreciated.
Kudos
Ayushi