04-24-2025 03:31 AM
Hi @daan_dw
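I think this issue mainly comes from using multithreading to process the XML files while reading from and writing to both S3 and DBFS. When the thread count gets too high, the threads are likely to hit race conditions, for example when several of them touch the same paths at the same time.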
To avoid this:
- Try reducing the number of threads.
- Make sure each thread writes to a unique directory to prevent any overlap.
- Call dbutils.fs.refreshMounts() periodically so the cluster's mount cache stays up to date, which should help avoid latency-related errors.
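
The first two points could look roughly like the sketch below. It is only an illustration, not your actual pipeline: `process_xml`, `process_all`, and `MAX_WORKERS` are hypothetical names, the "processing" is just writing the text back out, and it writes to a local temp directory so it runs anywhere. On Databricks you would point `base_dir` at your own DBFS or S3 location instead.

```python
import tempfile
import uuid
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Assumption: a small, fixed pool size. Tune this for your cluster.
MAX_WORKERS = 4


def process_xml(xml_text: str, base_dir: Path) -> Path:
    """Write one document into a directory that no other task uses."""
    # A random suffix gives each task its own output directory, so two
    # threads can never create or overwrite the same path.
    out_dir = base_dir / f"task-{uuid.uuid4().hex}"
    out_dir.mkdir(parents=True, exist_ok=False)
    out_file = out_dir / "output.xml"
    out_file.write_text(xml_text, encoding="utf-8")
    return out_file


def process_all(xml_docs, base_dir: Path, max_workers: int = MAX_WORKERS):
    """Process documents with a capped number of threads, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda doc: process_xml(doc, base_dir), xml_docs))


if __name__ == "__main__":
    base = Path(tempfile.mkdtemp())
    docs = [f"<row id='{i}'/>" for i in range(10)]
    paths = process_all(docs, base)
    print(len(paths), "files written under", base)
```

Capping `max_workers` limits how many requests hit storage at once, and the per-task directory means the threads never share an output path, so there is nothing left for them to race on.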