12-20-2021 08:57 AM
Please note that the issue referenced above, [Storage System] Support for AWS S3 (multiple clusters/drivers/JVMs), applies to Delta Lake OSS. As noted in that issue and in Issue 324, as of this writing S3 does not provide the putIfAbsent guarantee (an atomic "create this object only if it does not already exist") that Delta Lake relies on for transactional commits. For Delta Lake OSS, the community is working on PR 339 to resolve this.
That said, your question is specific to Databricks' implementation of Delta, which allows multiple clusters to write concurrently to the same Delta table by using the S3 commit service. The pertinent quote is:
Databricks runs a commit service that coordinates writes to Amazon S3 from multiple clusters. This service runs in the Databricks control plane.
For more information, please refer to "Configure Databricks S3 commit service-related settings".
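
To illustrate why the putIfAbsent guarantee matters for concurrent writers, here is a minimal Python sketch. It is not Delta Lake or Databricks code: `InMemoryLogStore`, `commit`, and the cluster names are hypothetical, and the in-memory dictionary only stands in for an object store. Two writers race to create the same transaction log entry, and because the check and the write happen atomically, exactly one of them succeeds.

```python
import threading


class InMemoryLogStore:
    """Hypothetical stand-in for an object store that holds log entries."""

    def __init__(self):
        self._objects = {}
        self._lock = threading.Lock()

    def put_if_absent(self, key, value):
        # The existence check and the write happen under one lock,
        # so no other writer can slip in between them.
        with self._lock:
            if key in self._objects:
                return False
            self._objects[key] = value
            return True

    def get(self, key):
        return self._objects[key]


def commit(store, version, writer_id):
    # Delta log entries are named by a zero-padded version number.
    key = f"_delta_log/{version:020d}.json"
    return store.put_if_absent(key, writer_id)


store = InMemoryLogStore()
results = {}


def writer(name):
    # Both writers try to commit the same table version.
    results[name] = commit(store, 1, name)


threads = [
    threading.Thread(target=writer, args=(name,))
    for name in ("cluster-a", "cluster-b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = [name for name, ok in results.items() if ok]
print("winning writer:", winners)
```

The losing writer would normally re-read the log and retry against the next version. Without an atomic put-if-absent, both writers could believe they had committed version 1 and one commit would silently overwrite the other, which is the gap that a coordinating service such as the one quoted above is meant to close.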