Databricks supports creating and working with Apache Iceberg tables natively under specific conditions. Managed Iceberg tables in Unity Catalog can be created directly on Databricks Runtime 16.4 LTS or newer. Setup requires enabling the Managed Iceberg private preview and meeting its requirements, such as having Unity Catalog enabled and granting the necessary external schema access permissions. Once configured, you can create these tables with SQL such as `CREATE OR REPLACE TABLE <catalog>.<schema>.<table> ... USING iceberg`.
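As a minimal sketch (run from a Databricks notebook, where `spark` is predefined; the catalog, schema, and table names below are placeholders, not anything from your workspace):

```python
# Sketch: create a managed Iceberg table in Unity Catalog from a Databricks
# notebook (DBR 16.4 LTS+ with the Managed Iceberg preview enabled).
# `main.analytics.events` is a placeholder three-level name; substitute a
# catalog/schema where you have CREATE TABLE privileges.
spark.sql("""
    CREATE OR REPLACE TABLE main.analytics.events (
        event_id   BIGINT,
        event_type STRING,
        event_ts   TIMESTAMP
    )
    USING iceberg
""")

# Managed Iceberg tables accept normal DML from Databricks:
spark.sql(
    "INSERT INTO main.analytics.events "
    "VALUES (42, 'click', current_timestamp())"
)
```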
However, for external (foreign) Iceberg tables, where the metadata is managed outside Databricks (e.g., in AWS Glue or Snowflake catalogs), Databricks allows read access only, and Iceberg tables written by third-party tools likewise remain read-only in Databricks. Managed Iceberg tables, in contrast, are writable from Databricks and are exposed through the Iceberg REST Catalog APIs, enabling interoperability with external Iceberg clients such as Spark, Flink, and Trino.
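For illustration, here is a hedged sketch of how an external (non-Databricks) Spark client might connect through that REST interface. The endpoint path, catalog name, and token are placeholders and assumptions on my part, so verify them against the Unity Catalog docs for your workspace:

```python
# Sketch: point an external Spark client at Unity Catalog's Iceberg REST
# endpoint. All <...> values are placeholders; the endpoint path follows
# Unity Catalog's documented pattern but should be verified for your setup.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("uc-iceberg-rest-client")
    # Iceberg runtime for Spark 3.5 / Scala 2.12 (pick a version matching
    # your cluster).
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    # Register a Spark catalog named `uc` backed by the Iceberg REST API.
    .config("spark.sql.catalog.uc", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.uc.type", "rest")
    .config("spark.sql.catalog.uc.uri",
            "https://<workspace-host>/api/2.1/unity-catalog/iceberg")
    # Unity Catalog catalog to expose through this Spark catalog.
    .config("spark.sql.catalog.uc.warehouse", "<uc_catalog_name>")
    # Bearer token (e.g., a Databricks personal access token).
    .config("spark.sql.catalog.uc.token", "<databricks_token>")
    .getOrCreate()
)

spark.sql("SHOW NAMESPACES IN uc").show()
```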
For clusters running Apache Spark 3.5.2, an Iceberg runtime JAR compatible with that Spark version must be loaded, along with the required configuration: setting the proper SQL extensions and specifying the Iceberg catalog details. Without this configuration, errors like the one encountered (`Failed to find the data source: iceberg`) may arise. For optimal compatibility and functionality, follow Databricks' guidelines and the preview-specific configuration.
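As a hedged example of the kind of setup that resolves that error on open-source Spark 3.5.x (the package version and warehouse path are assumptions; the local Hadoop-type catalog is purely for testing):

```python
# Sketch: minimal Iceberg configuration for open-source Spark 3.5.x.
# Without the runtime JAR on the classpath, `USING iceberg` fails with
# "Failed to find the data source: iceberg".
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-config-check")
    # 1. Load an Iceberg runtime JAR built for Spark 3.5 / Scala 2.12.
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2")
    # 2. Enable Iceberg's SQL extensions (MERGE INTO, CALL procedures, ...).
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    # 3. Specify catalog details; a local Hadoop-type catalog for testing.
    .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.local.type", "hadoop")
    .config("spark.sql.catalog.local.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)

# If this succeeds, the `iceberg` data source is resolving correctly.
spark.sql(
    "CREATE TABLE IF NOT EXISTS local.db.smoke_test (id BIGINT) USING iceberg"
)
```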
Hope this helps, Lou.