How to use Databricks Unity Catalog as the metastore for a local Spark session
12-05-2024 11:42 PM
Hello,
I would like to access Databricks Unity Catalog from a Spark session created outside the Databricks environment. Previously, I used the Hive metastore and had no issues connecting this way. I have now switched the metastore to Unity Catalog and want a local Spark session to use it as its metastore in the same way.
The Unity Catalog documentation includes some guidance on this, and the following configuration was shared:
bin/pyspark --name "local-uc-test" \
  --master "local[*]" \
  --packages "io.delta:delta-spark_2.12:3.2.1,io.unitycatalog:unitycatalog-spark_2.12:0.2.0" \
  --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
  --conf "spark.sql.catalog.spark_catalog=io.unitycatalog.spark.UCSingleCatalog" \
  --conf "spark.sql.catalog.unity=io.unitycatalog.spark.UCSingleCatalog" \
  --conf "spark.sql.catalog.unity.uri=http://localhost:8080" \
  --conf "spark.sql.catalog.unity.token=" \
  --conf "spark.sql.defaultCatalog=...
However, I’m not sure how to adapt this configuration for Databricks Unity Catalog. I would appreciate your assistance on this matter.
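For reference, the same settings can be written as a programmatic PySpark session, which makes it clearer which values would need to change. This is only a sketch of the command above: the helper names `uc_spark_conf` and `build_session` are hypothetical, the `uri` and `token` arguments are placeholders (I do not know the correct values for a Databricks workspace), and the truncated `spark.sql.defaultCatalog` setting is left out because its value is not shown.

```python
def uc_spark_conf(catalog: str, uri: str, token: str = "") -> dict:
    """Collect the settings from the pyspark command above into a dict.

    `catalog`, `uri` and `token` are placeholders; the values required for
    a Databricks workspace are exactly what this question is about.
    """
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # Equivalent of the --packages flag.
        "spark.jars.packages": (
            "io.delta:delta-spark_2.12:3.2.1,"
            "io.unitycatalog:unitycatalog-spark_2.12:0.2.0"
        ),
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": "io.unitycatalog.spark.UCSingleCatalog",
        prefix: "io.unitycatalog.spark.UCSingleCatalog",
        f"{prefix}.uri": uri,
        f"{prefix}.token": token,
        # spark.sql.defaultCatalog is omitted: its value is truncated above.
    }


def build_session(conf: dict, app_name: str = "local-uc-test"):
    """Create a local SparkSession with the given settings applied."""
    # Imported lazily so the dict above can be inspected without pyspark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).master("local[*]")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Same values as the local Unity Catalog example in the command above.
conf = uc_spark_conf("unity", "http://localhost:8080")
# spark = build_session(conf)  # requires pyspark and network access for the packages
```

With this layout, adapting to Databricks would presumably come down to passing a different `uri` and a non-empty `token` to `uc_spark_conf`, but I have not confirmed that.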