Writing files using multithreading to dbfs
04-11-2025 07:17 AM
Hello,
I am reading XML files from AWS S3 and storing them on dbfs:/ using multithreaded code. The code itself seems to be fine: for roughly the first 100,000 files it works without issues, and I can see the data arriving on DBFS.
However, it always ends up throwing the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/tmp_bolt/OUTBOUND_e6930c2a-a885-11ed-8d7c-00163e37acad_InformativePeakPowers_202302/OUTBOUND_e6930c2a-a885-11ed-8d7c-00163e37acad_InformativePeakPowers_202302/relfile.rels'
Note that the directory named in the FileNotFoundError is different every time. Also, the higher the thread count, the fewer files get processed before I hit this error.
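For reference, here is a minimal sketch of the kind of multithreaded write described above. It is not the actual code from this job: `write_one`, `copy_all`, the `fetch` callable (a stand-in for the S3 read) and the `max_workers` value are all hypothetical names chosen for illustration. It creates the parent directories right before each write and collects per-file failures instead of stopping at the first one, which makes it easier to see how many files fail and at which paths.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def write_one(root, rel_path, fetch):
    """Fetch one object and write it under root (hypothetical helper)."""
    target = os.path.join(root, rel_path)
    # exist_ok=True avoids an error when two threads create the same parent directory.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as fh:
        fh.write(fetch(rel_path))  # fetch stands in for the S3 download
    return target


def copy_all(root, rel_paths, fetch, max_workers=8):
    """Write all files concurrently; return (written paths, failed (path, error) pairs)."""
    written, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(write_one, root, p, fetch): p for p in rel_paths}
        for fut in as_completed(futures):
            try:
                written.append(fut.result())
            except OSError as exc:  # FileNotFoundError is a subclass of OSError
                failed.append((futures[fut], exc))
    return written, failed
```

In a setup like mine, `root` would be something like `/dbfs/tmp_bolt`, and lowering `max_workers` is the knob that corresponds to the thread count mentioned above.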
Any idea what is causing this?