dbutils.fs.cp requires write permissions on the source
09-20-2023 02:12 AM
I have an external location set up, "auth_kafka", which is mapped to an abfss URL and, critically, is marked as read-only.
Using dbutils.fs I can successfully read files in that location (ls and head calls against it all work), but I cannot run dbutils.fs.cp to copy files from there to DBFS.
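The original snippet isn't reproduced here, but for illustration the calls look something like this (the abfss path and file names are placeholders, not the real ones):

```python
# Reads against the read-only external location succeed:
files = dbutils.fs.ls("abfss://container@account.dfs.core.windows.net/kafka/")  # placeholder path
print(dbutils.fs.head(files[0].path))

# ...but copying out of it to DBFS fails:
dbutils.fs.cp(files[0].path, "dbfs:/tmp/kafka/", recurse=False)
```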
This fails with PERMISSION_DENIED: User does not have WRITE FILES on External Location 'auth_kafka'. The full stack trace:
ExecutionError: An error occurred while calling o548.cp.
: java.io.IOException: Server-side copy has failed. Please try disabling it through `databricks.spark.dbutils.fs.cp.server-side.enabled`
at com.databricks.backend.daemon.dbutils.FSUtils.cpRecursive(DBUtilsCore.scala:400)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$3(DBUtilsCore.scala:336)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$withCpSafetyChecks$2(DBUtilsCore.scala:160)
at com.databricks.backend.daemon.dbutils.FSUtils.withFsSafetyCheck(DBUtilsCore.scala:145)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$withCpSafetyChecks$1(DBUtilsCore.scala:152)
at com.databricks.backend.daemon.dbutils.FSUtils.withFsSafetyCheck(DBUtilsCore.scala:145)
at com.databricks.backend.daemon.dbutils.FSUtils.withCpSafetyChecks(DBUtilsCore.scala:152)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$2(DBUtilsCore.scala:333)
at com.databricks.backend.daemon.dbutils.FSUtils.checkPermission(DBUtilsCore.scala:140)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$1(DBUtilsCore.scala:333)
at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:571)
at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:666)
at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:684)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:426)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:196)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:424)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:418)
at com.databricks.backend.daemon.dbutils.FSUtils.withAttributionContext(DBUtilsCore.scala:69)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:470)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:455)
at com.databricks.backend.daemon.dbutils.FSUtils.withAttributionTags(DBUtilsCore.scala:69)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:661)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:580)
at com.databricks.backend.daemon.dbutils.FSUtils.recordOperationWithResultTags(DBUtilsCore.scala:69)
at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:571)
at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:540)
at com.databricks.backend.daemon.dbutils.FSUtils.recordOperation(DBUtilsCore.scala:69)
at com.databricks.backend.daemon.dbutils.FSUtils.recordDbutilsFsOp(DBUtilsCore.scala:133)
at com.databricks.backend.daemon.dbutils.FSUtils.cp(DBUtilsCore.scala:332)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
at py4j.Gateway.invoke(Gateway.java:306)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.databricks.sql.managedcatalog.acl.UnauthorizedAccessException: PERMISSION_DENIED: User does not have WRITE FILES on External Location 'auth_kafka'.
at com.databricks.managedcatalog.UCReliableHttpClient.reliablyAndTranslateExceptions(UCReliableHttpClient.scala:47)
at com.databricks.managedcatalog.UCReliableHttpClient.postJson(UCReliableHttpClient.scala:63)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$generateTemporaryPathCredentials$1(ManagedCatalogClientImpl.scala:3262)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$recordAndWrapException$2(ManagedCatalogClientImpl.scala:3674)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$recordAndWrapException$1(ManagedCatalogClientImpl.scala:3673)
at com.databricks.managedcatalog.ErrorDetailsHandler.wrapServiceException(ErrorDetailsHandler.scala:25)
at com.databricks.managedcatalog.ErrorDetailsHandler.wrapServiceException$(ErrorDetailsHandler.scala:23)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.wrapServiceException(ManagedCatalogClientImpl.scala:139)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.recordAndWrapException(ManagedCatalogClientImpl.scala:3670)
at com.databricks.managedcatalog.ManagedCatalogClientImpl.generateTemporaryPathCredentials(ManagedCatalogClientImpl.scala:3253)
at com.databricks.sql.managedcatalog.ManagedCatalogCommon.generateTemporaryPathCredentials(ManagedCatalogCommon.scala:1456)
at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.$anonfun$generateTemporaryPathCredentials$2(ProfiledManagedCatalog.scala:564)
at org.apache.spark.sql.catalyst.MetricKeyUtils$.measure(MetricKey.scala:319)
at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.$anonfun$profile$1(ProfiledManagedCatalog.scala:55)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.profile(ProfiledManagedCatalog.scala:54)
at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.generateTemporaryPathCredentials(ProfiledManagedCatalog.scala:564)
at com.databricks.unity.CredentialScopeSQLHelper$.checkPathOperations(CredentialScopeSQLHelper.scala:95)
at com.databricks.unity.CredentialScopeSQLHelper$.registerExternalLocationPath(CredentialScopeSQLHelper.scala:197)
at com.databricks.unity.CredentialScopeSQLHelper$.register(CredentialScopeSQLHelper.scala:154)
at com.databricks.unity.CredentialScopeSQLHelper$.registerPathAccess(CredentialScopeSQLHelper.scala:443)
at com.databricks.backend.daemon.dbutils.ExternalLocationHelper$.$anonfun$registerPaths$1(ExternalLocationHelper.scala:48)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
at com.databricks.backend.daemon.dbutils.ExternalLocationHelper$.registerPaths(ExternalLocationHelper.scala:41)
at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$3(DBUtilsCore.scala:334)
... 40 more
That particular error relates to the user not having write permissions on the external location. I can add a grant to give the user WRITE FILES on it (see the sketch after the error below), but that just defers the issue to the next tier, resulting in the error:
Caused by: com.databricks.sql.managedcatalog.acl.UnauthorizedAccessException: PERMISSION_DENIED: User cannot write to a read-only external location auth_kafka
since the external location is flagged as read-only.
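For reference, the grant in question was along these lines (the principal is a placeholder; WRITE FILES is the Unity Catalog privilege named in the first error):

```python
# Placeholder principal; grants the privilege the first error complains about.
spark.sql("GRANT WRITE FILES ON EXTERNAL LOCATION auth_kafka TO `user@example.com`")
```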
This is weird to me: why is write access needed to copy from the abfss location? Surely only read access is required on the source? I can confirm that opening up the permissions on the external location to allow writes resolves the issue, but that rather defeats the purpose of marking it read-only.
09-20-2023 06:07 AM - edited 09-20-2023 06:08 AM
@Retired_mod Sorry, I'm not sure how that's relevant. Was that posted to the wrong topic?
This question is about what appears to be a bug in dbutils.fs: the cp function seems to require write access on the data source rather than just read access, when write access should only be necessary on the destination.
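In the meantime, the IOException above suggests its own possible workaround: disabling server-side copy so cp falls back to a client-side read-then-write. A minimal, untested sketch; the config key is quoted verbatim from the error, but whether it can be set at runtime with spark.conf.set (rather than in the cluster's Spark config) is an assumption:

```python
# Config key copied verbatim from the IOException; setting it at runtime is an assumption.
spark.conf.set("databricks.spark.dbutils.fs.cp.server-side.enabled", "false")

# Retry the copy; with server-side copy disabled, cp may only need READ FILES
# on the source external location (paths are placeholders).
dbutils.fs.cp(
    "abfss://container@account.dfs.core.windows.net/kafka/some-file",
    "dbfs:/tmp/kafka/some-file",
)
```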

