Cant read/write tables with shared cluster

Ramonrcn · ‎02-19-2025

Hi!

I have a pipeline that i cant execute sucessfully in a shared cluster. Basically i read a query from multiple sources on my databricks instance, including streaming tables (thats the reason i have to use a shared cluster).

But when comes to the part to write the results to my hive_metastore bronze layer, i get insuficient permission errors, the first one was that the cluster did'nt have access permission enough to determine if given table existed, i got arround that error with a quick and dirty python solution:

def check_table():

try:

spark.sql(f"select * from db_bronze.{table}")

return 1

except Exception as e:

return 2

and my spark code is the following:

# Check if the table exists

exists = check_table()

if exists == 2:

windowSpec = Window.partitionBy("ID").orderBy(F.desc("ingestDatetime"))

df = df.withColumn("rank", F.row_number().over(windowSpec)) \

.filter(F.col("rank") == 1) \

.drop("rank")

df.write \

.format("delta") \

.mode("append") \

.option("path", bronze_path_full) \

.option("mergeSchema", "true") \

.saveAsTable(f"db_bronze.{table}")

else:

df.createOrReplaceTempView("vw_source")

spark.sql("""

CREATE OR REPLACE TEMP VIEW vw_latest_source AS

SELECT *

FROM (

SELECT *, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ingestDatetime DESC) as row_num

FROM vw_source

) tmp

WHERE row_num = 1

""")

spark.sql(f"""

MERGE INTO db_bronze.{table} as target

USING vw_latest_source as source

ON target.ID = source.ID

WHEN MATCHED AND source.ingestDatetime > target.ingestDatetime THEN

UPDATE SET *

WHEN NOT MATCHED THEN

INSERT *

""")

when i run this insert cell it gives this error:

org.apache.spark.SparkSecurityException: [INSUFFICIENT_PERMISSIONS] Insufficient privileges:

I've already tried:
Grant permissions via sql code in the notebook using the grant permission to user, grant ownership, grant all-files permission but i cant work arround this issue.

My user has all the permissions to create, modify, delete tables in hive_metastore and other catalogs but i suspect it is a cluster level permission issue.
Anyone faced the same situation?