Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

how to access snapshots in iceberg tables?

gaurang033
New Contributor II

I have created an Iceberg table in Databricks and inserted a bunch of values into it.

How do I list the snapshots and other metadata of the table?

create table raw.landing.emp_ice (id int, name string) using iceberg

The following, from the Iceberg docs, doesn't work:

https://iceberg.apache.org/docs/latest/spark-queries/#snapshots

select * from raw.landing.emp_ice.snapshots;

Error:

`raw`.`landing`.`emp_ice`.`snapshots` is not a valid identifier as it has more than 2 name parts. SQLSTATE: 42601

 

2 REPLIES

Ale_Armillotta
Valued Contributor II

Hi @gaurang033, I'm not an Iceberg expert, but I did some research and testing and I think I can point you in the right direction.

The error you're seeing (more than 2 name parts. SQLSTATE: 42601) happens because, as far as I can tell, Databricks SQL does not support 4-part identifiers in the FROM clause, so catalog.schema.table.snapshots gets rejected by the SQL parser.

But there's also a deeper reason: in Databricks, CREATE TABLE ... USING iceberg doesn't always create a true native Iceberg table. The behavior depends on whether Predictive Optimization is enabled on your workspace:

- Without Predictive Optimization → Databricks creates a Delta table with Iceberg UniForm (Iceberg metadata is generated asynchronously on top of Delta). In this case, use DESCRIBE HISTORY to inspect snapshots:

DESCRIBE HISTORY raw.landing.emp_ice;

- With Predictive Optimization → Databricks creates a native managed Iceberg table. In this case, the .snapshots syntax should work, but only via PySpark (not Databricks SQL):

spark.sql("SELECT * FROM raw.landing.emp_ice.snapshots").display()
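
If the snapshots query works there, the other Iceberg metadata tables documented upstream (history, files, manifests) should be reachable the same way. A hedged sketch, using the table names from the Iceberg Spark docs and wrapped in spark.sql() like the query above:

```sql
-- Other Iceberg metadata tables per the upstream Iceberg Spark docs
-- (run each via spark.sql(...) in PySpark, as with snapshots above)
SELECT * FROM raw.landing.emp_ice.history;   -- table state changes over time
SELECT * FROM raw.landing.emp_ice.files;     -- current data files
SELECT * FROM raw.landing.emp_ice.manifests; -- current manifest files
```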

To check which type of table you actually have, run:

DESCRIBE EXTENDED raw.landing.emp_ice;

- If you see a Delta Uniform Iceberg section → it's a Delta+UniForm table → use DESCRIBE HISTORY

- If you see Provider: iceberg without Delta references → it's a native Iceberg table → use PySpark

You can also check the Iceberg metadata generation status with:

SHOW TBLPROPERTIES raw.landing.emp_ice;

and look for converted_delta_version and converted_delta_timestamp.

Hope this helps! If you found my answer useful, feel free to give me a Kudo 🙂

Louis_Frolio
Databricks Employee

Greetings @gaurang033 , 

You're reading the Iceberg docs correctly. In a vanilla Iceberg-on-Spark setup, metadata tables like snapshots, history, and files are queryable like this:

SELECT * FROM prod.db.table.snapshots;

Your query follows that pattern exactly. The error you're getting:

raw.landing.emp_ice.snapshots is not a valid identifier as it has more than 2 name parts. SQLSTATE: 42601

is the SQL engine treating that as a standard four-part identifier -- not as a reference to the Iceberg snapshots metadata table. It sees four name parts and rejects them.

Worth flagging: I haven't found documentation that confirms querying Iceberg metadata tables this way is supported in Databricks SQL today. The error behavior and the gap in the docs point in the same direction -- this likely isn't supported in this environment right now.


Here's what you can do today.

Option 1: Try DESCRIBE HISTORY first

Run:

DESCRIBE HISTORY raw.landing.emp_ice;

If it works on your Iceberg table, you'll get per-operation metadata -- timestamp, operation type, user, etc. -- similar to Delta table history. From there, time-travel works like this:

SELECT *
FROM raw.landing.emp_ice VERSION AS OF <version_number>;

where version_number comes from the version column in that history output.
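
If DESCRIBE HISTORY works, timestamp-based time travel should too. A sketch, assuming the Delta-style TIMESTAMP AS OF syntax applies to this table (the timestamp literal below is just a placeholder):

```sql
-- Time travel by timestamp; replace the literal with a timestamp
-- from the history output's timestamp column
SELECT *
FROM raw.landing.emp_ice TIMESTAMP AS OF '2025-01-15T00:00:00Z';
```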

One caveat: I haven't found docs that explicitly guarantee DESCRIBE HISTORY works for Iceberg tables across all Databricks runtimes, so test it in your workspace first. If you get a "not supported for Iceberg" error, this path isn't available for that table type in your environment.

Option 2: Use an external Iceberg client via Unity Catalog's Iceberg REST Catalog

If you need the full Iceberg metadata tables -- snapshots, files, manifests, all of it -- you can reach them through an external Iceberg client configured against Unity Catalog's Iceberg REST Catalog endpoint.

Once that's set up, standard Iceberg metadata queries work from the external Spark session:

SELECT * FROM <uc_catalog>.<schema>.emp_ice.snapshots;
SELECT * FROM <uc_catalog>.<schema>.emp_ice.history;
SELECT * FROM <uc_catalog>.<schema>.emp_ice.files;

You'll need Unity Catalog enabled and an Iceberg-capable client configured with spark.sql.catalog.<name>.type=rest pointing at the REST endpoint. This is more of a platform-level integration than a quick SQL change in the Databricks UI, but it gets you full access to the Iceberg metadata table experience.
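
For a rough idea of what that client-side setup looks like, here is a hedged spark-defaults sketch. The catalog name, endpoint, and token are placeholders, not values confirmed in this thread:

```
# spark-defaults.conf sketch (hypothetical values; adapt to your workspace)
spark.sql.catalog.uc        org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.uc.type   rest
spark.sql.catalog.uc.uri    <unity-catalog-iceberg-rest-endpoint>
spark.sql.catalog.uc.token  <personal-access-token>
```

You'll also need the matching iceberg-spark-runtime package on the client's classpath for the SparkCatalog class to resolve.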


Short version:

Your query follows upstream Iceberg spec, but the Databricks SQL engine isn't interpreting emp_ice.snapshots as a metadata table reference -- hence the error.

Start with DESCRIBE HISTORY and see if that covers your use case. If you need the full Iceberg metadata tables, the current path is an external Iceberg client pointed at the Unity Catalog REST Catalog endpoint.

If DESCRIBE HISTORY fails or behaves unexpectedly, grab the runtime version and full error message and take that to Databricks Support. Based on what I'm seeing in the docs and the error you're hitting, I don't have enough to claim there's another route beyond what's laid out here.

 

Hope this helps, Louis.