Error while inserting data to unity catalog from AWS EMR (spark) for uniform enable table

adityapa — Thu, 21 Aug 2025 08:32:06 GMT

Hi Everyone,

I am trying to write data to a Delta table created in Unity Catalog with an external location. I am using AWS EMR; my Spark and table properties are below.

#### Spark Shell

```
spark-shell \
--conf "spark.sql.defaultCatalog=<catalog_name>" \
--conf "spark.sql.catalog.<catalog_name>.warehouse=<catalog_name>" \
--conf spark.databricks.unityCatalog.enabled=true \
--conf spark.hadoop.fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf "spark.sql.catalog.<catalog_name>=io.unitycatalog.spark.UCSingleCatalog" \
--conf "spark.sql.catalog.<catalog_name>.type=rest" \
--conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog" \
--conf "spark.sql.catalog.<catalog_name>.uri=https://${URI}/api/2.1/unity-catalog" \
--packages "org.apache.hadoop:hadoop-aws:3.4.1,org.apache.hadoop:hadoop-common:3.4.1,io.delta:delta-spark_2.12:3.2.1,io.unitycatalog:unitycatalog-spark_2.12:0.2.1,org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.1,io.delta:delta-iceberg_2.12:3.3.2" \
--conf "spark.sql.catalog.<catalog_name>.credential=token" \
--conf "spark.sql.catalog.<catalog_name>.token=${DATABRICKS_TOKEN}" \
--conf "spark.hadoop.fs.s3a.endpoint=s3.us-west-1.amazonaws.com" \
--conf "spark.hadoop.fs.s3a.endpoint.region=us-west-1" \
--conf "spark.hadoop.fs.s3a.region=us-west-1" \
--conf "spark.databricks.delta.uniform.iceberg.sync.convert.enabled=true" \
--conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension,org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions" \
--conf "spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider"
```

#### Table Configs :

```
CREATE EXTERNAL TABLE poc_prod_adm.v1.table8 (a STRING, b STRING, c BIGINT, d BIGINT)
USING DELTA
PARTITIONED BY (a, b)
LOCATION 's3://<bucket>/<subfolder>/<catalog_name>/<schema_name>/table8'
TBLPROPERTIES (
'delta.columnMapping.mode' = 'name',
'delta.enableIcebergCompatV2' = 'true',
'delta.universalFormat.enabledFormats' = 'iceberg',
'delta.minReaderVersion' = 2,
'delta.minWriterVersion' = 5
);
```

-----

When I insert data from AWS EMR (Spark), I get the following error:

```

scala> spark.sql("""INSERT INTO <catalog_name>.<schema_name>.table8 (
| a,
| b,
| c,
| d
| )
| VALUES (
| 'a',
| 'b',
| 20250820,
| 20250915
| );""");
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
25/08/20 13:13:02 WARN SparkStringUtils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
25/08/20 13:13:17 WARN HiveConf: HiveConf of name hive.server2.thrift.url does not exist
25/08/20 13:13:18 WARN HiveConf: HiveConf of name hive.server2.thrift.url does not exist
25/08/20 13:13:18 ERROR IcebergConverter: Error when converting to Iceberg metadata
org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: [SCHEMA_NOT_FOUND] The schema `<schema_name>` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS.
```

#### Notes :
1. Our requirement is to write data from Spark (Delta interface) and read it through both the Delta and Iceberg interfaces from tools such as Spark, DuckDB, and Trino.
2. We use a UniForm table to meet this requirement, so these table properties are essential.
2.1 Only `'delta.universalFormat.enabledFormats' = 'iceberg'` is strictly required; the other properties are added to support it (they either must be enabled or are already the defaults).
3. The Spark config `spark.databricks.delta.uniform.iceberg.sync.convert.enabled` is set to `true` based on the details in: https://github.com/delta-io/delta/blob/v3.3.2/spark/src/main/scala/org/apache/spark/sql/delta/sources/DeltaSQLConf.scala#L1508

Any help is appreciated.

Re: Error while inserting data to unity catalog from AWS EMR (spark) for uniform enable table

SP_6721 — Fri, 22 Aug 2025 14:45:09 GMT

Hi @adityapa ,

Can you first confirm that EMR can actually see your catalog and schema? Try running:

spark.sql("SHOW CATALOGS").show(false)
spark.sql("SHOW SCHEMAS IN <catalog_name>").show(false)
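
Because the error is logged by `IcebergConverter` rather than by the Delta write itself, it may also help to compare what the Unity Catalog catalog and the session catalog (`spark_catalog`) each report. The sketch below is only a diagnostic for the same spark-shell session. The placeholder values are hypothetical, and whether the converter resolves the schema through `spark_catalog` is an assumption, not something confirmed in this thread.

```scala
// Run inside the spark-shell session from the original post; `spark` is provided by the shell.
// Hypothetical placeholder: substitute the real catalog name.
val ucCatalog = "<catalog_name>"

// Catalog the session currently resolves unqualified names against (Spark 3.4+).
println(spark.catalog.currentCatalog())

// Schemas visible through the Unity Catalog catalog.
spark.sql(s"SHOW SCHEMAS IN `$ucCatalog`").show(false)

// Schemas visible through the session catalog. If <schema_name> appears in the
// first listing but not in this one, that would be consistent with the
// SCHEMA_NOT_FOUND error coming from a lookup outside Unity Catalog.
spark.sql("SHOW SCHEMAS IN spark_catalog").show(false)
```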

Re: Error while inserting data to unity catalog from AWS EMR (spark) for uniform enable table

adityapa — Mon, 25 Aug 2025 09:59:47 GMT

Hi @SP_6721 ,

I can read the data from Spark (since it uses the Delta log), and I can view the schema and catalog details from Trino on EMR.

I can also write data to the Delta files in S3 through UC. However, the Iceberg metadata/manifest files are not being updated, which causes the issue described above.
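
One way to confirm this from the same spark-shell session is to list the table's `_delta_log` and `metadata` directories and compare modification times. This is a rough sketch: it assumes UniForm writes its Iceberg metadata under `<table location>/metadata`, and that the session's S3 credentials are allowed to list the table location. The `tableLocation` value and the `listDir` helper are hypothetical.

```scala
import org.apache.hadoop.fs.Path

// Hypothetical value: substitute the real LOCATION of the table.
val tableLocation = "s3://<bucket>/<subfolder>/<catalog_name>/<schema_name>/table8"

// Hypothetical helper: print the files in a sub-directory of the table, oldest first.
def listDir(subDir: String): Unit = {
  val dir = new Path(tableLocation, subDir)
  val fs = dir.getFileSystem(spark.sparkContext.hadoopConfiguration)
  if (fs.exists(dir)) {
    fs.listStatus(dir)
      .sortBy(_.getModificationTime)
      .foreach(s => println(s"${s.getModificationTime}  ${s.getPath.getName}"))
  } else {
    println(s"Directory not found: $dir")
  }
}

// Delta commits: a new JSON file should appear after each successful INSERT.
listDir("_delta_log")

// Iceberg metadata (assumed location): if nothing new appears here after an
// INSERT, the Delta commit succeeded but the Iceberg conversion did not.
listDir("metadata")
```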
