Data Governance

Error while inserting data into Unity Catalog from AWS EMR (Spark) for a UniForm-enabled table

adityapa
New Contributor II

Hi Everyone,

I am trying to write data to a Delta table created in Unity Catalog with an external location. I am using AWS EMR; my table definition and Spark properties are below.

#### Spark Shell

```
spark-shell \
--conf "spark.sql.defaultCatalog=<catalog_name>" \
--conf "spark.sql.catalog.<catalog_name>.warehouse=<catalog_name>" \
--conf spark.databricks.unityCatalog.enabled=true \
--conf spark.hadoop.fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf "spark.sql.catalog.<catalog_name>=io.unitycatalog.spark.UCSingleCatalog" \
--conf "spark.sql.catalog.<catalog_name>.type=rest" \
--conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog" \
--conf "spark.sql.catalog.<catalog_name>.uri=https://${URI}/api/2.1/unity-catalog" \
--packages "org.apache.hadoop:hadoop-aws:3.4.1,org.apache.hadoop:hadoop-common:3.4.1,io.delta:delta-spark_2.12:3.2.1,io.unitycatalog:unitycatalog-spark_2.12:0.2.1,org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.1,io.delta:delta-iceberg_2.12:3.3.2" \
--conf "spark.sql.catalog.<catalog_name>.credential=token" \
--conf "spark.sql.catalog.<catalog_name>.token=${DATABRICKS_TOKEN}" \
--conf "spark.hadoop.fs.s3a.endpoint=s3.us-west-1.amazonaws.com" \
--conf "spark.hadoop.fs.s3a.endpoint.region=us-west-1" \
--conf "spark.hadoop.fs.s3a.region=us-west-1" \
--conf "spark.databricks.delta.uniform.iceberg.sync.convert.enabled=true" \
--conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension,org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions" \
--conf "spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider"
```
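
For reference, the same settings can also be applied programmatically (for example when submitting a job with spark-submit instead of using spark-shell). This is only an abridged sketch: the conf keys are copied from the command above, `<catalog_name>` stays a placeholder, `URI` and `DATABRICKS_TOKEN` are read from the environment, and the jars from the `--packages` list still need to be provided on the classpath.

```
// Sketch only: the spark-shell configuration above expressed via the SparkSession builder.
// <catalog_name> is a placeholder; URI and DATABRICKS_TOKEN come from the environment.
import org.apache.spark.sql.SparkSession

val uri   = sys.env("URI")
val token = sys.env("DATABRICKS_TOKEN")

val spark = SparkSession.builder()
  .appName("uc-uniform-writer")
  .config("spark.sql.defaultCatalog", "<catalog_name>")
  .config("spark.sql.catalog.<catalog_name>", "io.unitycatalog.spark.UCSingleCatalog")
  .config("spark.sql.catalog.<catalog_name>.type", "rest")
  .config("spark.sql.catalog.<catalog_name>.uri", s"https://$uri/api/2.1/unity-catalog")
  .config("spark.sql.catalog.<catalog_name>.credential", "token")
  .config("spark.sql.catalog.<catalog_name>.token", token)
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .config("spark.sql.extensions",
    "io.delta.sql.DeltaSparkSessionExtension,org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
  .config("spark.hadoop.fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
  .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
  .config("spark.hadoop.fs.s3a.endpoint.region", "us-west-1")
  .config("spark.databricks.delta.uniform.iceberg.sync.convert.enabled", "true")
  .getOrCreate()
```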

#### Table Configs

```
CREATE EXTERNAL TABLE poc_prod_adm.v1.table8 (a STRING, b STRING, c BIGINT, d BIGINT)
USING DELTA
PARTITIONED BY (a, b)
LOCATION 's3://<bucket>/<subfolder>/<catalog_name>/<schema_name>/table8'
TBLPROPERTIES (
'delta.columnMapping.mode' = 'name',
'delta.enableIcebergCompatV2' = 'true',
'delta.universalFormat.enabledFormats' = 'iceberg',
'delta.minReaderVersion' = 2,
'delta.minWriterVersion' = 5
);
```

-----

While inserting data from AWS EMR (Spark), I am getting the following error:

```

scala> spark.sql("""INSERT INTO <catalog_name>.<schema_name>.table8 (
| a,
| b,
| c,
| d
| )
| VALUES (
| 'a',
| 'b',
| 20250820,
| 20250915,
| );""");
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
25/08/20 13:13:02 WARN SparkStringUtils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
25/08/20 13:13:17 WARN HiveConf: HiveConf of name hive.server2.thrift.url does not exist
25/08/20 13:13:18 WARN HiveConf: HiveConf of name hive.server2.thrift.url does not exist
25/08/20 13:13:18 ERROR IcebergConverter: Error when converting to Iceberg metadata
org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: [SCHEMA_NOT_FOUND] The schema `<schema_name>` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS.
```


#### Notes :
1. Our requirement is to write data from Spark (through the Delta interface) and read it through both the Delta and Iceberg interfaces with tools like Spark, DuckDB, Trino, etc.
2. We are using a UniForm table for this requirement, so those table properties are crucial. Strictly, only `'delta.universalFormat.enabledFormats' = 'iceberg'` is required; the other properties are added to support it (they either need to be enabled or are defaults).
3. The Spark config `spark.databricks.delta.uniform.iceberg.sync.convert.enabled=true` is set as per the details mentioned in: https://github.com/delta-io/delta/blob/v3.3.2/spark/src/main/scala/org/apache/spark/sql/delta/source... (a quick check of this setting is sketched below).
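
For completeness, a minimal sketch of those checks: the session flag from note 3 and the UniForm-related table properties from the DDL above. `SHOW TBLPROPERTIES` is standard Spark SQL; `<catalog_name>`/`<schema_name>` are the same placeholders as in the INSERT statement.

```
// Sketch only: confirm the UniForm sync flag is set in this session...
println(spark.conf.get("spark.databricks.delta.uniform.iceberg.sync.convert.enabled", "<not set>"))

// ...and inspect the UniForm-related properties recorded on the table.
spark.sql("SHOW TBLPROPERTIES <catalog_name>.<schema_name>.table8").show(100, false)
```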


Any help is appreciated.

2 REPLIES

SP_6721
Contributor III

Hi @adityapa ,

Can you first confirm that EMR can actually see your catalog and schema? Try running:

spark.sql("SHOW CATALOGS").show(false)
spark.sql("SHOW SCHEMAS IN <catalog_name>").show(false)

adityapa
New Contributor II

Hi @SP_6721 ,

I am able to read data from Spark (since it reads the Delta logs) and can view the schema/catalog details from Trino on EMR.

I am also able to write data to the Delta files in S3 through UC. However, the Iceberg metadata/manifest files are not getting updated, which causes the issue mentioned above.
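
One way to see whether any Iceberg metadata is being produced at all (a sketch, assuming the standard Iceberg layout where metadata files land under a `metadata/` directory beneath the LOCATION from the CREATE TABLE statement):

```
// Sketch only: list the Iceberg metadata directory under the table's external location.
// The path is a placeholder built from the LOCATION clause in the DDL above.
import org.apache.hadoop.fs.Path

val metadataDir = new Path("s3a://<bucket>/<subfolder>/<catalog_name>/<schema_name>/table8/metadata")
val fs = metadataDir.getFileSystem(spark.sparkContext.hadoopConfiguration)

if (fs.exists(metadataDir)) {
  // Print each metadata/manifest file with its last modification time.
  fs.listStatus(metadataDir).foreach { s =>
    println(s"${s.getPath.getName}  ${new java.util.Date(s.getModificationTime)}")
  }
} else {
  println(s"No Iceberg metadata directory at $metadataDir")
}
```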
