Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks S3 Commit Service

ed_carv
New Contributor

Is the Databricks S3 commit service enabled by default if Unity Catalog is not enabled and the compute resources run in our AWS account (classic compute plane)? If not, how can it be enabled?

This service appears to resolve the limitations of multi-cluster writes to Delta Lake tables stored in S3 and to guarantee ACID transactions.

I understand this Delta Lake limitation can also be addressed by setting up a DynamoDB table for the Delta transaction log, but I wanted to confirm whether that is still necessary, since Databricks appears to have its own solution to this problem.
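
For reference, the DynamoDB-based setup I mean is the open-source Delta Lake multi-cluster LogStore, configured roughly like this (table name, region, and bucket path are placeholders, and the cluster needs the delta-spark and delta-storage-s3-dynamodb artifacts on its classpath):

```python
# Sketch of the open-source alternative: multi-cluster Delta Lake writes
# to S3 coordinated through a DynamoDB-backed LogStore.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta-s3-multicluster")
    # Standard Delta Lake session wiring
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    # Route _delta_log commits on s3a:// paths through DynamoDB
    .config("spark.delta.logStore.s3a.impl",
            "io.delta.storage.S3DynamoDBLogStore")
    .config("spark.io.delta.storage.S3DynamoDBLogStore.ddb.tableName",
            "delta_log")  # placeholder DynamoDB table name
    .config("spark.io.delta.storage.S3DynamoDBLogStore.ddb.region",
            "us-east-1")  # placeholder region
    .getOrCreate()
)

# With this in place, concurrent writers on separate clusters can safely
# append to the same table, e.g.:
# df.write.format("delta").mode("append").save("s3a://my-bucket/tables/events")
```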

 

1 REPLY

VZLA
Databricks Employee

 

No, the Databricks S3 commit service is not guaranteed to be enabled by default in the AWS classic compute plane. The configuration may vary based on your specific workspace setup.

How can it be enabled?

To enable the Databricks S3 commit service, follow these steps:

  1. Ensure proper instance profiles are configured to grant clusters appropriate access to S3 buckets.
  2. Configure Spark parameters to explicitly enable the service and disable conflicting optimizations such as direct uploads (a sketch follows below).

https://docs.databricks.com/en/security/network/classic/s3-commit-service.html
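
As a minimal sketch of step 2: these settings belong in the cluster's Spark config field (set at cluster creation, not at runtime); a SparkSession builder is used below only for illustration. The threshold parameter is an assumption on my part, so please confirm the exact keys against the linked page:

```python
# Minimal sketch of step 2, for illustration only. On Databricks, put these
# key/value pairs in the cluster's "Spark config" field at creation time;
# the SparkSession builder here just shows the same settings in code form.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("s3-commit-service-settings")
    # Assumption: a direct-upload threshold of 0 disables direct uploads,
    # forcing all writes through the commit service. Verify this key (and
    # any explicit enable flag for your workspace) against the linked page.
    .config("spark.hadoop.databricks.s3commit.directPutFileSizeThreshold",
            "0")
    .getOrCreate()
)
```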
