Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Unity Catalog Shared compute Issues

Zume
New Contributor II

Am I the only one experiencing challenges in migrating to Databricks Unity Catalog? I observed that in Unity Catalog-enabled compute, the "Shared" access mode is still tagged as a Preview feature. This means it is not yet safe for use in production workloads. Having a compute resource that can be shared in production is crucial because various developers and service principals need to be able to execute queries on the cluster. I'm wondering how others are working around this issue since it is a major blocker to effectively migrating all workloads to Unity Catalog.

Additionally, when I tested my code on Shared access mode compute, I noticed that it gets stuck when trying to read a file stored in an external location into a DataFrame.
Watch this video for a demo of the issue: https://www.youtube.com/watch?v=J1bn6P7elKI&ab_channel=AfroInfoTech
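
Roughly, the read that hangs looks like this (a sketch only; the storage account, container, and file names are placeholders, not my actual paths):

# Reading a Parquet file from a Unity Catalog external location by its
# cloud URI; per the description above, this is the call that never returns
# on Shared access mode compute.
df = spark.read.parquet(
    "abfss://raw@examplestorage.dfs.core.windows.net/landing/my_file.parquet"
)
display(df)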

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @Zume

  • You’re right that the “Shared” access mode is currently in Preview. While it’s not recommended for production workloads, there are ways to work around this limitation:
    • Workspace ACLs (Access Control Lists): Use workspace ACLs to control access to the shared compute resources. This allows you to restrict who can execute queries on the cluster.
    • Service Principals: Consider using service principals with specific permissions to access the shared compute. This way, you can ensure that only authorized users and services can utilize the cluster.
    • Scheduled Access: Limit the shared compute’s availability to specific time windows when needed. This reduces the risk of contention during peak usage.
    • Monitor and Optimize: Regularly monitor the shared compute’s performance and resource utilization. Optimize queries and resource allocation to avoid bottlenecks.
  • When reading files from external locations into a DataFrame, ensure that the file paths are correctly specified.
  • Verify that the external location (e.g., Azure Blob Storage, AWS S3) is accessible from your Databricks cluster.
  • Check for any network-related issues or authentication problems.
  • Review the code and use the appropriate reader for your file format (e.g., Parquet, CSV, JSON) based on your data source; see the sketch after this list.
  • Remember that Unity Catalog offers centralized control, auditing, lineage, and data discovery, making the move well worth the effort.
  • If you encounter specific issues, consult the Databricks documentation or support for further guidance. And thank you for sharing the video demo—I’ll take a look! 🚀🔍
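
For illustration, here is a minimal sketch of such a read on a Unity Catalog cluster. The external location, group, and path names are placeholder assumptions, not taken from your setup:

# Ensure the querying principals can read files at the external location
# (placeholder location and group names; requires privileges to grant).
spark.sql("GRANT READ FILES ON EXTERNAL LOCATION `raw_landing` TO `data_engineers`")

# Read with an explicit format that matches the source data.
df = (spark.read
      .format("parquet")
      .load("abfss://raw@examplestorage.dfs.core.windows.net/landing/sales/"))
df.printSchema()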

jacovangelder
Contributor III

Have you tried creating a volume on top of the external location, and using the volume in spark.read.parquet?

i.e.

spark.read.parquet('/Volumes/<catalog>/<schema>/<volume_name>/<folder_name>/<file_name.parquet>')
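
For completeness, a sketch of creating such a volume first (the catalog, schema, volume, and location names are placeholders):

# Register an external volume over the external location, then read
# through the governed /Volumes path instead of the raw cloud URI.
spark.sql("""
    CREATE EXTERNAL VOLUME IF NOT EXISTS main.landing.raw_files
    LOCATION 'abfss://raw@examplestorage.dfs.core.windows.net/landing'
""")
df = spark.read.parquet('/Volumes/main/landing/raw_files/sales/my_file.parquet')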

Edit: also, not sure why the Databricks community manager here said Shared access mode is "in preview" and that it's "not recommended for production workloads", because this is completely false. It is not in preview and is completely safe for production workloads, and has been for almost 2 years. The only thing currently in preview for shared access mode clusters is Scala workloads.

 
