I have an external location set up, "auth_kafka", which is mapped to an abfss URL:
abfss://{container}@{account}.dfs.core.windows.net/auth/kafka
and, critically, is marked as read-only.
Using dbutils.fs I can successfully read files in that location; for example, these ls and head calls both work:
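dbutils.fs.ls("abfss://{container}@{account}.dfs.core.windows.net/auth/kafka")
dbutils.fs.head("abfss://{container}@{account}.dfs.core.windows.net/auth/kafka/client.truststore.jks")

However, I cannot run dbutils.fs.cp to copy a file from there to DBFS: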
dbutils.fs.cp("abfss://{container}@{account}.dfs.core.windows.net/auth/kafka/client.truststore.jks", "dbfs:/FileStore/Certs/client.truststore.jks")
This results in the following error (the root cause is PERMISSION_DENIED: User does not have WRITE FILES on External Location 'auth_kafka'):
ExecutionError: An error occurred while calling o548.cp.
: java.io.IOException: Server-side copy has failed. Please try disabling it through `databricks.spark.dbutils.fs.cp.server-side.enabled`
at com.databricks.backend.daemon.dbutils.FSUtils.cpRecursive(DBUtilsCore.scala:400)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$3(DBUtilsCore.scala:336)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$withCpSafetyChecks$2(DBUtilsCore.scala:160)
at com.databricks.backend.daemon.dbutils.FSUtils.withFsSafetyCheck(DBUtilsCore.scala:145)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$withCpSafetyChecks$1(DBUtilsCore.scala:152)
at com.databricks.backend.daemon.dbutils.FSUtils.withFsSafetyCheck(DBUtilsCore.scala:145)
at com.databricks.backend.daemon.dbutils.FSUtils.withCpSafetyChecks(DBUtilsCore.scala:152)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$2(DBUtilsCore.scala:333)
at com.databricks.backend.daemon.dbutils.FSUtils.checkPermission(DBUtilsCore.scala:140)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$1(DBUtilsCore.scala:333)
at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:571)
at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:666)
at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:684)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:426)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:196)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:424)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:418)
at com.databricks.backend.daemon.dbutils.FSUtils.withAttributionContext(DBUtilsCore.scala:69)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:470)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:455)
at com.databricks.backend.daemon.dbutils.FSUtils.withAttributionTags(DBUtilsCore.scala:69)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:661)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:580)
at com.databricks.backend.daemon.dbutils.FSUtils.recordOperationWithResultTags(DBUtilsCore.scala:69)
at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:571)
at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:540)
at com.databricks.backend.daemon.dbutils.FSUtils.recordOperation(DBUtilsCore.scala:69)
at com.databricks.backend.daemon.dbutils.FSUtils.recordDbutilsFsOp(DBUtilsCore.scala:133)
at com.databricks.backend.daemon.dbutils.FSUtils.cp(DBUtilsCore.scala:332)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
at py4j.Gateway.invoke(Gateway.java:306)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.databricks.sql.managedcatalog.acl.UnauthorizedAccessException: PERMISSION_DENIED: User does not have WRITE FILES on External Location 'auth_kafka'.
at com.databricks.managedcatalog.UCReliableHttpClient.reliablyAndTranslateExceptions(UCReliableHttpClient.scala:47)
at com.databricks.managedcatalog.UCReliableHttpClient.postJson(UCReliableHttpClient.scala:63)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$generateTemporaryPathCredentials$1(ManagedCatalogClientImpl.scala:3262)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$recordAndWrapException$2(ManagedCatalogClientImpl.scala:3674)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$recordAndWrapException$1(ManagedCatalogClientImpl.scala:3673)
at com.databricks.managedcatalog.ErrorDetailsHandler.wrapServiceException(ErrorDetailsHandler.scala:25)
at com.databricks.managedcatalog.ErrorDetailsHandler.wrapServiceException$(ErrorDetailsHandler.scala:23)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.wrapServiceException(ManagedCatalogClientImpl.scala:139)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.recordAndWrapException(ManagedCatalogClientImpl.scala:3670)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.generateTemporaryPathCredentials(ManagedCatalogClientImpl.scala:3253)
at com.databricks.sql.managedcatalog.ManagedCatalogCommon.generateTemporaryPathCredentials(ManagedCatalogCommon.scala:1456)
at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.$anonfun$generateTemporaryPathCredentials$2(ProfiledManagedCatalog.scala:564)
at org.apache.spark.sql.catalyst.MetricKeyUtils$.measure(MetricKey.scala:319)
at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.$anonfun$profile$1(ProfiledManagedCatalog.scala:55)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.profile(ProfiledManagedCatalog.scala:54)
at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.generateTemporaryPathCredentials(ProfiledManagedCatalog.scala:564)
at com.databricks.unity.CredentialScopeSQLHelper$.checkPathOperations(CredentialScopeSQLHelper.scala:95)
at com.databricks.unity.CredentialScopeSQLHelper$.registerExternalLocationPath(CredentialScopeSQLHelper.scala:197)
at com.databricks.unity.CredentialScopeSQLHelper$.register(CredentialScopeSQLHelper.scala:154)
at com.databricks.unity.CredentialScopeSQLHelper$.registerPathAccess(CredentialScopeSQLHelper.scala:443)
at com.databricks.backend.daemon.dbutils.ExternalLocationHelper$.$anonfun$registerPaths$1(ExternalLocationHelper.scala:48)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
at com.databricks.backend.daemon.dbutils.ExternalLocationHelper$.registerPaths(ExternalLocationHelper.scala:41)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$3(DBUtilsCore.scala:334)
... 40 more
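As an aside, the outer IOException suggests disabling server-side copy. I assume that maps to a Spark conf along these lines (a guess based only on the key named in the trace; I haven't verified it avoids the permission check):

# Config key copied verbatim from the IOException message above; that
# spark.conf.set is the right place to set it is an assumption.
spark.conf.set("databricks.spark.dbutils.fs.cp.server-side.enabled", "false")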
The underlying PERMISSION_DENIED error relates to the user not having write permission on the external location. I can add a grant to give the user WRITE FILES on it, e.g. (with a placeholder principal):
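# Hypothetical principal shown; WRITE FILES on an external location is a
# standard Unity Catalog grant.
spark.sql("GRANT WRITE FILES ON EXTERNAL LOCATION auth_kafka TO `user@example.com`")

but that just defers the issue to the next tier, resulting in the error: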
Caused by: com.databricks.sql.managedcatalog.acl.UnauthorizedAccessException: PERMISSION_DENIED: User cannot write to a read-only external location auth_kafka
since the external location is flagged as read-only.
This is weird to me: why is write access needed to copy from the abfss location? Surely cp only needs to read the source? I can confirm that opening up the permissions on the external location to allow writes resolves the issue, but that kinda defeats the purpose of marking it read-only?
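In the meantime, the workaround I'm leaning towards is doing the copy myself: read the bytes (READ FILES is enough for that) and write them to DBFS through the local filesystem. A rough sketch, assuming Spark's binaryFile source and the /dbfs FUSE mount are available on the cluster:

# Read the source using only READ FILES on the external location...
src = "abfss://{container}@{account}.dfs.core.windows.net/auth/kafka/client.truststore.jks"
content = spark.read.format("binaryFile").load(src).select("content").head()["content"]
# ...then write to DBFS via the FUSE mount, so no external-location write
# check is involved.
dbutils.fs.mkdirs("dbfs:/FileStore/Certs")
with open("/dbfs/FileStore/Certs/client.truststore.jks", "wb") as f:
    f.write(content)

But it still seems like dbutils.fs.cp ought to be able to do this with only READ FILES on the source.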