How to get the hadoopConfiguration in a unity catalog standard access mode app ?
Context:
- job running on a job cluster configured in Standard access mode (formerly Shared access mode)
- Scala 2.12.15 / Spark 3.5.0 jar program
- Databricks runtime 15.4 LTS
In this context, it is not possible to access `sparkSession.sparkContext`, as confirmed in this page (https://learn.microsoft.com/en-us/azure/databricks/compute/access-mode-limitations#shared-limitation...).
How can I retrieve `sparkSession.sparkContext.hadoopConfiguration`?
In Unity Catalog standard access mode (formerly shared access mode) with Databricks Runtime 15.4 LTS, direct access to `sparkSession.sparkContext` is restricted as part of the security limitations. However, there are still ways to access the Hadoop configuration.
Accessing Hadoop Configuration in Standard Access Mode
Since Databricks Runtime 15.4 LTS includes improvements for Scala support in standard access mode, you have a few options:
1. Use the `spark` variable directly to access configuration:
```scala
val hadoopConfig = spark.sessionState.newHadoopConf()
```
2. For specific Hadoop configuration properties, you can use:
```scala
val configValue = spark.conf.get("spark.hadoop.[property-name]")
```
3. If you need to set Hadoop configurations, you can use:
```scala
spark.conf.set("spark.hadoop.[property-name]", "value")
```
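Putting the three options together, the sketch below shows a round trip: setting a Hadoop property through the `spark.hadoop.` prefix, then reading it back both from the Spark configuration and from the Hadoop `Configuration` produced by `newHadoopConf()` (which strips the prefix). This assumes it runs on a Databricks cluster where `spark` is the active SparkSession; the property name `fs.s3a.connection.maximum` is just an illustrative example.

```scala
// Illustrative property only -- substitute the Hadoop property you need.
spark.conf.set("spark.hadoop.fs.s3a.connection.maximum", "200")

// newHadoopConf() folds spark.hadoop.* entries into the Hadoop Configuration
// with the "spark.hadoop." prefix stripped, so the same value is visible
// through both APIs:
val hadoopConf = spark.sessionState.newHadoopConf()
val fromHadoop = hadoopConf.get("fs.s3a.connection.maximum")
val fromSpark  = spark.conf.get("spark.hadoop.fs.s3a.connection.maximum")
assert(fromHadoop == fromSpark)
```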
These approaches work within the security constraints of standard access mode while still giving you access to the Hadoop configuration properties you need.
Remember that in standard access mode, Unity Catalog doesn't respect cluster-level Hadoop filesystem settings when accessing data it governs. If you need to access cloud storage, you should use an external location configured in Unity Catalog rather than trying to configure Hadoop filesystem settings directly.
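As a hedged sketch of that recommended path: once an external location is configured in Unity Catalog, you read from its URI directly and never touch Hadoop credential settings. The storage account, container, and path below are placeholders, not real resources.

```scala
// Hypothetical read from a Unity Catalog external location on a
// Standard access mode cluster; `spark` is the active SparkSession.
// Replace the abfss URI with one covered by an external location you
// actually have READ FILES permission on.
val df = spark.read
  .format("parquet")
  .load("abfss://mycontainer@mystorageaccount.dfs.core.windows.net/path/to/data")
df.show(5)
```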
For Scala applications in Databricks Runtime 15.4 LTS, many of the previous limitations have been lifted, and most Dataset API operations are now supported, making it easier to work with data in standard access mode.

