Writing files to DBFS using multithreading

daan_dw

Hello,

I am reading XML files from AWS S3 and storing them on dbfs:/ using multithreaded code. The code itself seems to be fine: for roughly the first 100,000 files it works without issues, and I can see the data arriving on DBFS.

However, it always ends up throwing the following error:

FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/tmp_bolt/OUTBOUND_e6930c2a-a885-11ed-8d7c-00163e37acad_InformativePeakPowers_202302/OUTBOUND_e6930c2a-a885-11ed-8d7c-00163e37acad_InformativePeakPowers_202302/relfile.rels'
 
Note that the directory named in the FileNotFoundError is different every time. Also, the higher the thread count, the fewer files are processed before I get this error.

Any idea what is causing this?
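
For reference, here is a minimal sketch of the kind of pattern described above. It is not the exact code from this job: `fetch_bytes`, `write_one`, and `write_all` are hypothetical names, the S3 download is replaced by a stand-in function, and the output root is a temporary directory rather than /dbfs. It only illustrates a thread pool that creates each parent directory before writing and records which paths failed; it does not diagnose the error.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_bytes(key: str) -> bytes:
    """Hypothetical stand-in for the S3 download (assumption, not a real API)."""
    return f"<doc key='{key}'/>".encode("utf-8")


def write_one(root: str, key: str) -> str:
    """Write one file under root, creating its parent directory first."""
    target = os.path.join(root, key)
    # exist_ok=True lets several threads create the same directory safely.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as fh:
        fh.write(fetch_bytes(key))
    return target


def write_all(root: str, keys, max_workers: int = 8):
    """Write all keys with a thread pool; return (written paths, failures)."""
    written, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(write_one, root, key): key for key in keys}
        for fut in as_completed(futures):
            try:
                written.append(fut.result())
            except OSError as exc:
                # FileNotFoundError is a subclass of OSError; keep the key
                # so the failing directory can be inspected afterwards.
                failed.append((futures[fut], exc))
    return written, failed


if __name__ == "__main__":
    # Assumption: a local temp directory stands in for a /dbfs/... path.
    root = tempfile.mkdtemp()
    keys = [f"pkg_{i}/pkg_{i}/relfile.rels" for i in range(20)]
    written, failed = write_all(root, keys, max_workers=4)
    print(len(written), "written,", len(failed), "failed")
```

Collecting failures instead of letting the first exception stop the run makes it possible to see whether the failing paths share anything, for example a directory that another thread was still creating or had already cleaned up.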
