01-04-2024 02:42 PM
Hi all, I am trying to read an external Iceberg table. A separate Spark SQL script creates my Iceberg table, and now I need to read that table (created outside of Databricks) from my Databricks notebook. Could someone tell me the approach for that? I tried using spark.read.format("iceberg").load("s3://path to my Iceberg data folder") but am getting an error. Any help would be appreciated.
01-05-2024 12:21 AM
Have you installed the jar to be able to read iceberg?
https://www.dremio.com/blog/getting-started-with-apache-iceberg-in-databricks/
You can also try to use the Uniform format, if that is possible of course.
https://docs.databricks.com/en/delta/uniform.html
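For the UniForm route, the idea is that a Delta table is made readable by Iceberg clients via table properties. A rough sketch (the table name and columns are placeholders, not from the original thread):

```sql
-- Hypothetical Delta table with UniForm enabled so Iceberg readers can consume it
CREATE TABLE main.default.events (id BIGINT, ts TIMESTAMP)
TBLPROPERTIES (
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);
```

Note this solves the opposite direction (Iceberg clients reading Databricks tables), so it only applies if you can switch where the table is written.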
01-05-2024 10:27 AM
Hi @-werners-
I am using Databricks Runtime 10.4 (Spark 3.2), so I have downloaded "iceberg-spark-runtime-3.2_2.12".
Also, the table exists in the S3 bucket.
The error message is: java.util.NoSuchElementException: None.get
I am also attaching a screenshot for reference.
01-09-2024 08:01 AM
You also need to configure the cluster, according to the blog.
If that still does not work, can you try with a recent LTS release, e.g. 13.3?
01-09-2024 08:07 AM
Hi @-werners-, the cluster was provisioned with all the requirements as stated in the doc. I also tried with Runtime 13.2 and the corresponding Iceberg jar. This time only the error message changed (it is more informative now), but Databricks is still not able to read the Iceberg tables in S3 with Glue as the catalog. The error says: AnalysisException: [TABLE_OR_VIEW_NOT_FOUND], as it is not able to read from the Glue catalog. I also provisioned the instance profile with access to Glue and the S3 bucket.
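For reference, a Glue-backed Iceberg catalog is typically wired up in the cluster's Spark config along these lines (the catalog name glue_cat and the warehouse path are placeholders; the matching iceberg-aws bundle jar must also be on the cluster):

```
spark.sql.extensions org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.glue_cat org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.glue_cat.catalog-impl org.apache.iceberg.aws.glue.GlueCatalog
spark.sql.catalog.glue_cat.io-impl org.apache.iceberg.aws.s3.S3FileIO
spark.sql.catalog.glue_cat.warehouse s3://<bucket>/<warehouse>
```

With that in place, the table has to be referenced with a fully qualified name, e.g. SELECT * FROM glue_cat.my_db.my_table; an unqualified table name resolves against the default catalog and will still raise TABLE_OR_VIEW_NOT_FOUND.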
01-18-2024 01:06 PM
Hi @Retired_mod, yes, the Iceberg table does not exist in the default catalog because it is created externally (outside of Databricks) by a separate Spark SQL script. The catalog it uses is the Glue catalog. The question is: how can I access that external Iceberg table from within my Databricks notebook?
08-20-2024 04:39 PM
Hi @Ambesh, did you solve this eventually? I am getting the same error: AnalysisException: [TABLE_OR_VIEW_NOT_FOUND]
02-24-2025 10:27 PM
The following settings were found to work for using Apache Iceberg via the Hadoop catalog on Databricks:
- Use a Databricks Runtime version of 12.2LTS or earlier.
- Set the access mode to "No isolation shared" (the mode where Unity Catalog cannot be used).
- Use a library compatible with Java 8 (i.e., an Iceberg library earlier than version 1.6.1).
- Apply the necessary Iceberg-related settings in the Spark configuration.
There is also an article (in Japanese) that explains how to resolve the errors:
- https://qiita.com/manabian/items/4c2c78c7db77f704e5ab
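As a sketch of the Iceberg-related Spark settings mentioned in the last bullet above (the catalog name hadoop_cat and the warehouse path are placeholders):

```
spark.sql.extensions org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.hadoop_cat org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.hadoop_cat.type hadoop
spark.sql.catalog.hadoop_cat.warehouse s3://<bucket>/<warehouse>
```

The Hadoop catalog resolves tables purely from the warehouse path, so no Glue or Hive metastore is involved.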
2 weeks ago
Hi, I'm facing the same problem.
However, when I set the access mode to "No isolation shared" I lose access to the external location where the Iceberg table resides. Is there a way to force Spark to NOT use the catalog even in the "Standard (formerly Shared)" access mode? I've tried setting the following option in the compute configuration:
spark.databricks.unityCatalog.enabled false
but that doesn't seem to make any difference; I'm still getting the familiar error:
NoSuchTableException: [TABLE_OR_VIEW_NOT_FOUND] The table or view ___ cannot be found. Verify the spelling and correctness of the schema and catalog. If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
which is of course correct, as the Iceberg table isn't known to the catalog. But why does it have to be in the catalog at all? Can I not just read the Iceberg table data without registering it in a catalog?
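For what it's worth, the Iceberg Spark runtime does support path-based reads without any catalog registration, as long as the path points at the table root (the directory containing the metadata/ folder), not the data/ subfolder. A sketch, assuming the iceberg-spark-runtime jar is attached and the path is a placeholder:

```python
# Sketch: path-based Iceberg read on an existing Spark session.
# Assumes the iceberg-spark-runtime jar matching the cluster's Spark version
# is installed, and the path is the table root (contains metadata/), not data/.
df = spark.read.format("iceberg").load("s3://<bucket>/<warehouse>/my_db/my_table")
df.show()
```

Whether this works on Unity Catalog-enabled access modes is a separate question, since those modes restrict custom data source formats.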