Administration & Architecture

Delta Lake S3 multi-cluster writes - DynamoDB

JonLaRose
New Contributor III

Hi there!

I'm trying to figure out how the multi-writers architecture for Delta Lake tables is implemented under the hood.

I understand that a DynamoDB table is used to provide mutual exclusion, but the question is: where is that table located? Is it in the Databricks control plane or in the user's account (the data plane)?
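For reference, this is roughly the multi-cluster-writes setup I have in mind from the open-source Delta Lake docs (the DynamoDB table name, region, and bucket below are just placeholders):

```python
from pyspark.sql import SparkSession

# Rough sketch of the open-source Delta Lake "S3 multi-cluster writes" setup:
# a DynamoDB-backed LogStore provides the put-if-absent mutual exclusion.
# Table name, region, and bucket are placeholders.
spark = (
    SparkSession.builder
    .config("spark.delta.logStore.s3a.impl", "io.delta.storage.S3DynamoDBLogStore")
    .config("spark.io.delta.storage.S3DynamoDBLogStore.ddb.tableName", "delta_log")
    .config("spark.io.delta.storage.S3DynamoDBLogStore.ddb.region", "us-east-1")
    .getOrCreate()
)

spark.range(10).write.format("delta").mode("append").save("s3a://my-bucket/tables/events")
```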

If it's in the data plane, how can I provide permissions to create/update this specific table?

If it's in the control plane, why is it failing with the following error?

Py4JJavaError: An error occurred while calling o476.save.
: com.amazonaws.services.securitytoken.model.AWSSecurityTokenServiceException: The security token included in the request is invalid.

Thanks!


Kaniz
Community Manager

Hi @JonLaRose, the multi-writer architecture for Delta Lake tables on Databricks uses the S3 commit service.
- The S3 commit service ensures write consistency across multiple clusters writing to a single table.
- The service is part of the control plane; it does not read any data from S3, it only puts a new file if one does not already exist.
- The documentation does not reference a DynamoDB table; on Databricks, commit coordination is handled by the S3 commit service in the control plane.
- The S3 commit service is what implements ACID transactions and ensures consistency.
- The error (AWSSecurityTokenServiceException: The security token included in the request is invalid) is most likely caused by invalid or expired AWS credentials.
- The S3 commit service uses temporary AWS credentials passed from the data plane, which are valid for six hours.
- If those credentials are invalid or expired, you will see this error.
- To fix it, ensure your AWS credentials are valid and not expired (a quick check is sketched below).
- If you are using IAM roles, ensure the role grants the permissions needed for these operations.
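As a quick sanity check (assuming boto3 is available on the driver), you can ask STS who the current credentials belong to; if they are expired or invalid, this call fails with the same kind of token error:

```python
import boto3
from botocore.exceptions import ClientError

def check_aws_credentials():
    """Sanity check: can the credentials the cluster is using call STS?"""
    sts = boto3.client("sts")
    try:
        identity = sts.get_caller_identity()
        print(f"Credentials are valid for: {identity['Arn']}")
    except ClientError as e:
        # An InvalidClientTokenId / ExpiredToken error here points to the same
        # credential problem reported by the commit service.
        print(f"Credential check failed: {e}")

check_aws_credentials()
```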

JonLaRose
New Contributor III

Thank you, @Kaniz.

Does the S3 Commit service use the `s3a` configured S3 endpoint (from the Spark session Hadoop configurations)? If not, is there a way to configure the S3 endpoint that the S3 Commit service uses? 

Kaniz
Community Manager

Hi @JonLaRose, the S3 commit service is a Databricks service that helps guarantee consistency of writes across multiple clusters on a single table in specific cases. It runs in the Databricks control plane and coordinates writes to Amazon S3 from multiple clusters.

 

Regarding your question, the S3 commit service sends temporary AWS credentials from the compute plane to the control plane in the commit service API call. The compute plane writes data directly to S3, and then the S3 commit service in the control plane provides concurrency control by finalizing the commit log upload. The commit service does not read any data from S3. It puts a new file in S3 if it does not exist.
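To make the "put a new file only if it does not exist" idea concrete: it is the same conditional-write pattern that the open-source DynamoDB-backed LogStore uses for mutual exclusion. A conceptual sketch only (not Databricks internals; the table and attribute names are illustrative):

```python
import boto3
from botocore.exceptions import ClientError

# Conceptual sketch of put-if-absent mutual exclusion for Delta commits,
# in the style of the open-source S3DynamoDBLogStore. Table and attribute
# names are illustrative placeholders, not Databricks internals.
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("delta_log_commits")

def try_commit(table_path: str, version: int) -> bool:
    """Attempt to claim commit `version` for `table_path`; only one writer can win."""
    try:
        table.put_item(
            Item={"tablePath": table_path, "fileName": f"{version:020d}.json"},
            # Fails if another cluster already claimed this commit version.
            ConditionExpression="attribute_not_exists(fileName)",
        )
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # another writer committed this version first
        raise
```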

 

To access s3a:// files from Apache Spark™, you must pass certain configurations to spark-submit (or set them on the cluster) and specify the endpoint; a sketch of this is below. You can find more information on configuring Databricks S3 commit service-related settings in the Databricks documentation. I hope this helps!
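For example, the endpoint used by the compute plane's Hadoop S3A client can be set on the Spark session (the endpoint and region values below are placeholders); note that this configures the cluster's own S3 access, not the control-plane commit service:

```python
from pyspark.sql import SparkSession

# Sketch: pointing the Hadoop S3A client at a specific S3 endpoint.
# Endpoint/region are placeholders; this affects the compute plane's direct
# S3 reads and writes, not the control-plane commit service.
spark = (
    SparkSession.builder
    .config("spark.hadoop.fs.s3a.endpoint", "s3.us-east-1.amazonaws.com")
    .config("spark.hadoop.fs.s3a.endpoint.region", "us-east-1")
    .getOrCreate()
)
```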
