06-17-2022 04:05 AM
What I mean is that cloud storage has limits on how much it can process concurrently.
Apparently in your case this is not (yet) an issue when you execute the writes at the same time.
Is it an option to run certain jobs sequentially, or to group customers that share the same transformations?
Another workspace could also be a (less optimal) solution, or you could raise it with your Databricks contact.
Thinking about your use case, I would try to build some kind of framework that lets you manage the processing more dynamically.
Easier said than done, I know 🙂 But right now every new customer means a new Spark script, and that is a pain to manage.
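To make the "framework" idea concrete, here is a minimal sketch of a metadata-driven approach: instead of one script per customer, each customer becomes a row of configuration, and a single generic pipeline applies the transformation steps listed for it. All names here (`TRANSFORMATIONS`, `CUSTOMER_CONFIGS`, `run_pipeline`) are hypothetical, and plain Python lists stand in for Spark DataFrames just to show the shape of the design.

```python
# Hypothetical sketch: config-driven pipeline instead of one script per customer.
# Plain Python lists of dicts stand in for Spark DataFrames.

# Registry of reusable transformation steps, keyed by name.
TRANSFORMATIONS = {
    # Drop exact duplicate rows (dict insertion order is preserved).
    "dedupe": lambda rows: list({tuple(r.items()): r for r in rows}.values()),
    # Upper-case the "name" column.
    "uppercase_name": lambda rows: [{**r, "name": r["name"].upper()} for r in rows],
}

# Per-customer metadata: which steps to run, in which order.
# Adding a customer means adding a config entry, not writing a new script.
CUSTOMER_CONFIGS = [
    {"customer": "acme", "steps": ["dedupe", "uppercase_name"]},
    {"customer": "globex", "steps": ["dedupe"]},
]

def run_pipeline(rows, config):
    """Apply the configured transformation steps to the data, in order."""
    for step in config["steps"]:
        rows = TRANSFORMATIONS[step](rows)
    return rows
```

With something like this, customers that share the same `steps` list can also be grouped and processed together, which ties back to the grouping idea above.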