Databricks supports creating and working with Apache Iceberg tables natively under specific conditions. Managed Iceberg tables in Unity Catalog can be created directly on Databricks Runtime 16.4 LTS or newer. Setup requires enabling the Managed Iceberg private preview and meeting its requirements, such as having Unity Catalog enabled and granting the necessary external schema access permissions. Once configured, you can create these tables with SQL such as `CREATE OR REPLACE TABLE <catalog>.<schema>.<table> ... USING iceberg`.
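As a minimal sketch (run from a Databricks notebook, where `spark` is predefined; the catalog, schema, and table names below are placeholders, not anything from your workspace):

```python
# Sketch: create a managed Iceberg table in Unity Catalog from a Databricks
# notebook (DBR 16.4 LTS+ with the Managed Iceberg preview enabled).
# `main.analytics.events` is a placeholder three-level name; substitute a
# catalog/schema where you have CREATE TABLE privileges.
spark.sql("""
    CREATE OR REPLACE TABLE main.analytics.events (
        event_id   BIGINT,
        event_type STRING,
        event_ts   TIMESTAMP
    )
    USING iceberg
""")

# Managed Iceberg tables accept normal DML from Databricks:
spark.sql(
    "INSERT INTO main.analytics.events "
    "VALUES (42, 'click', current_timestamp())"
)
```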
However, for external (foreign) Iceberg tables, where the metadata is managed outside Databricks (e.g., in AWS Glue or Snowflake catalogs), Databricks allows read access only, and Iceberg tables written by third-party tools likewise remain read-only in Databricks. Managed Iceberg tables, in contrast, are writable from Databricks and are exposed through the Iceberg REST Catalog APIs, enabling interoperability with external Iceberg clients such as Spark, Flink, and Trino.
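For illustration, here is a hedged sketch of how an external (non-Databricks) Spark client might connect through that REST interface. The endpoint path, catalog name, and token are placeholders and assumptions on my part, so verify them against the Unity Catalog docs for your workspace:

```python
# Sketch: point an external Spark client at Unity Catalog's Iceberg REST
# endpoint. All <...> values are placeholders; the endpoint path follows
# Unity Catalog's documented pattern but should be verified for your setup.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("uc-iceberg-rest-client")
    # Iceberg runtime for Spark 3.5 / Scala 2.12 (pick a version matching
    # your cluster).
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    # Register a Spark catalog named `uc` backed by the Iceberg REST API.
    .config("spark.sql.catalog.uc", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.uc.type", "rest")
    .config("spark.sql.catalog.uc.uri",
            "https://<workspace-host>/api/2.1/unity-catalog/iceberg")
    # Unity Catalog catalog to expose through this Spark catalog.
    .config("spark.sql.catalog.uc.warehouse", "<uc_catalog_name>")
    # Bearer token (e.g., a Databricks personal access token).
    .config("spark.sql.catalog.uc.token", "<databricks_token>")
    .getOrCreate()
)

spark.sql("SHOW NAMESPACES IN uc").show()
```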
For clusters running Apache Spark 3.5.2, an Iceberg runtime JAR compatible with that Spark version must be loaded, along with the required configuration: setting the proper SQL extensions and specifying the Iceberg catalog details. Without this configuration, errors like the one encountered (`Failed to find the data source: iceberg`) may arise. For optimal compatibility and functionality, follow Databricks' guidelines and the preview-specific configuration.
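As a hedged example of the kind of setup that resolves that error on open-source Spark 3.5.x (the package version and warehouse path are assumptions; the local Hadoop-type catalog is purely for testing):

```python
# Sketch: minimal Iceberg configuration for open-source Spark 3.5.x.
# Without the runtime JAR on the classpath, `USING iceberg` fails with
# "Failed to find the data source: iceberg".
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-config-check")
    # 1. Load an Iceberg runtime JAR built for Spark 3.5 / Scala 2.12.
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2")
    # 2. Enable Iceberg's SQL extensions (MERGE INTO, CALL procedures, ...).
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    # 3. Specify catalog details; a local Hadoop-type catalog for testing.
    .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.local.type", "hadoop")
    .config("spark.sql.catalog.local.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)

# If this succeeds, the `iceberg` data source is resolving correctly.
spark.sql(
    "CREATE TABLE IF NOT EXISTS local.db.smoke_test (id BIGINT) USING iceberg"
)
```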
Hope this helps, Lou.