Cant read/write tables with shared cluster
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
02-19-2025 10:14 AM - edited 02-19-2025 10:19 AM
Hi!
I have a pipeline that i cant execute sucessfully in a shared cluster. Basically i read a query from multiple sources on my databricks instance, including streaming tables (thats the reason i have to use a shared cluster).
But when comes to the part to write the results to my hive_metastore bronze layer, i get insuficient permission errors, the first one was that the cluster did'nt have access permission enough to determine if given table existed, i got arround that error with a quick and dirty python solution:
def check_table():
try:
spark.sql(f"select * from db_bronze.{table}")
return 1
except Exception as e:
return 2
and my spark code is the following:
# Check if the table exists
exists = check_table()
if exists == 2:
windowSpec = Window.partitionBy("ID").orderBy(F.desc("ingestDatetime"))
df = df.withColumn("rank", F.row_number().over(windowSpec)) \
.filter(F.col("rank") == 1) \
.drop("rank")
df.write \
.format("delta") \
.mode("append") \
.option("path", bronze_path_full) \
.option("mergeSchema", "true") \
.saveAsTable(f"db_bronze.{table}")
else:
df.createOrReplaceTempView("vw_source")
spark.sql("""
CREATE OR REPLACE TEMP VIEW vw_latest_source AS
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ingestDatetime DESC) as row_num
FROM vw_source
) tmp
WHERE row_num = 1
""")
spark.sql(f"""
MERGE INTO db_bronze.{table} as target
USING vw_latest_source as source
ON target.ID = source.ID
WHEN MATCHED AND source.ingestDatetime > target.ingestDatetime THEN
UPDATE SET *
WHEN NOT MATCHED THEN
INSERT *
""")
when i run this insert cell it gives this error:
org.apache.spark.SparkSecurityException: [INSUFFICIENT_PERMISSIONS] Insufficient privileges:
I've already tried:
Grant permissions via sql code in the notebook using the grant permission to user, grant ownership, grant all-files permission but i cant work arround this issue.
My user has all the permissions to create, modify, delete tables in hive_metastore and other catalogs but i suspect it is a cluster level permission issue.
Anyone faced the same situation?
Grant permissions via sql code in the notebook using the grant permission to user, grant ownership, grant all-files permission but i cant work arround this issue.
My user has all the permissions to create, modify, delete tables in hive_metastore and other catalogs but i suspect it is a cluster level permission issue.
Anyone faced the same situation?