Data Engineering

Why Shared Access Mode for Unity Catalog enabled DLT pipeline?

hayden_blair
New Contributor III

Hello all,

I am trying to use an RDD API in a Unity Catalog enabled Delta Live Tables pipeline.

I am getting an error because Unity Catalog enabled DLT can only run on "shared access mode" compute, and RDD APIs are not supported on shared access compute for security reasons.

Is there a reason that Unity Catalog enabled DLT is shared access mode only? Can we expect to see Unity Catalog enabled DLT pipelines that can run on single user compute anytime soon?

Unity Catalog with DLT doc (says shared compute only)

Compute access mode restrictions for Unity Catalog (says no RDD API on shared access mode)

Thank you!

Hayden

2 Replies

Slash
Contributor

Hi @hayden_blair ,

The error you are encountering is related to Py4J security settings. In shared access mode, Py4J security is enabled by default, which blocks certain methods from being called on the Spark RDD object.

To put it simply, security around Unity Catalog is strict but necessary 🙂

hayden_blair
New Contributor III

Thank you for the response @Slash. Do you know if single user clusters are inherently less secure? I am still curious about why single user access mode is not allowed for DLT + Unity Catalog.
