<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: dbutils.fs.cp requires write permissions on the source in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/dbutils-fs-cp-requires-write-permissions-on-the-source/m-p/45446#M27884</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;Sorry...not sure how that's relevant? Was that posted to the wrong topic?&lt;/P&gt;&lt;P&gt;This question is about what appears to be a bug in dbutils.fs, where the cp function appears to require write access to the source (as opposed to just read access). Write access should only be necessary on the destination.&lt;/P&gt;</description>
    <pubDate>Wed, 20 Sep 2023 13:08:34 GMT</pubDate>
    <dc:creator>mwoods</dc:creator>
    <dc:date>2023-09-20T13:08:34Z</dc:date>
    <item>
      <title>dbutils.fs.cp requires write permissions on the source</title>
      <link>https://community.databricks.com/t5/data-engineering/dbutils-fs-cp-requires-write-permissions-on-the-source/m-p/45414#M27876</link>
      <description>&lt;P&gt;I have an external location set up, "auth_kafka", which is mapped to an abfss URL:&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;FONT face="andale mono,times"&gt;&lt;SPAN&gt;abfss://{container}@{account}.dfs.core.windows.net/auth/kafka&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;SPAN&gt;and, critically, is marked as read-only.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Using dbutils.fs I can successfully read the files (i.e. the ls and head calls on files in that location all work), but I cannot run dbutils.fs.cp to copy files from there to DBFS, as follows:&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;FONT face="andale mono,times"&gt;&lt;SPAN&gt;dbutils.fs.cp(&lt;/SPAN&gt;&lt;SPAN&gt;"abfss://{container}@{account}.dfs.core.windows.net/auth/kafka/client.truststore.jks"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"dbfs:/FileStore/Certs/client.truststore.jks"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;This results in the following error (PERMISSION_DENIED: User does not have WRITE FILES on External Location 'auth_kafka'.)&lt;/P&gt;&lt;PRE&gt;ExecutionError: An error occurred while calling o548.cp.&lt;BR /&gt;: java.io.IOException: Server-side copy has failed. 
Please try disabling it through `databricks.spark.dbutils.fs.cp.server-side.enabled`&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.cpRecursive(DBUtilsCore.scala:400)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$3(DBUtilsCore.scala:336)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$withCpSafetyChecks$2(DBUtilsCore.scala:160)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.withFsSafetyCheck(DBUtilsCore.scala:145)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$withCpSafetyChecks$1(DBUtilsCore.scala:152)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.withFsSafetyCheck(DBUtilsCore.scala:145)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.withCpSafetyChecks(DBUtilsCore.scala:152)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$2(DBUtilsCore.scala:333)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.checkPermission(DBUtilsCore.scala:140)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$1(DBUtilsCore.scala:333)&lt;BR /&gt;at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:571)&lt;BR /&gt;at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:666)&lt;BR /&gt;at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:684)&lt;BR /&gt;at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:426)&lt;BR /&gt;at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)&lt;BR /&gt;at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:196)&lt;BR /&gt;at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:424)&lt;BR /&gt;at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:418)&lt;BR /&gt;at 
com.databricks.backend.daemon.dbutils.FSUtils.withAttributionContext(DBUtilsCore.scala:69)&lt;BR /&gt;at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:470)&lt;BR /&gt;at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:455)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.withAttributionTags(DBUtilsCore.scala:69)&lt;BR /&gt;at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:661)&lt;BR /&gt;at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:580)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.recordOperationWithResultTags(DBUtilsCore.scala:69)&lt;BR /&gt;at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:571)&lt;BR /&gt;at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:540)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.recordOperation(DBUtilsCore.scala:69)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.recordDbutilsFsOp(DBUtilsCore.scala:133)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.cp(DBUtilsCore.scala:332)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)&lt;BR /&gt;at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;BR /&gt;at java.lang.reflect.Method.invoke(Method.java:498)&lt;BR /&gt;at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)&lt;BR /&gt;at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)&lt;BR /&gt;at py4j.Gateway.invoke(Gateway.java:306)&lt;BR /&gt;at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)&lt;BR /&gt;at py4j.commands.CallCommand.execute(CallCommand.java:79)&lt;BR /&gt;at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)&lt;BR /&gt;at 
py4j.ClientServerConnection.run(ClientServerConnection.java:115)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:750)&lt;BR /&gt;Caused by: com.databricks.sql.managedcatalog.acl.UnauthorizedAccessException: PERMISSION_DENIED: User does not have WRITE FILES on External Location 'auth_kafka'.&lt;BR /&gt;at com.databricks.managedcatalog.UCReliableHttpClient.reliablyAndTranslateExceptions(UCReliableHttpClient.scala:47)&lt;BR /&gt;at com.databricks.managedcatalog.UCReliableHttpClient.postJson(UCReliableHttpClient.scala:63)&lt;BR /&gt;at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$generateTemporaryPathCredentials$1(ManagedCatalogClientImpl.scala:3262)&lt;BR /&gt;at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$recordAndWrapException$2(ManagedCatalogClientImpl.scala:3674)&lt;BR /&gt;at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)&lt;BR /&gt;at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$recordAndWrapException$1(ManagedCatalogClientImpl.scala:3673)&lt;BR /&gt;at com.databricks.managedcatalog.ErrorDetailsHandler.wrapServiceException(ErrorDetailsHandler.scala:25)&lt;BR /&gt;at com.databricks.managedcatalog.ErrorDetailsHandler.wrapServiceException$(ErrorDetailsHandler.scala:23)&lt;BR /&gt;at com.databricks.managedcatalog.ManagedCatalogClientImpl.wrapServiceException(ManagedCatalogClientImpl.scala:139)&lt;BR /&gt;at com.databricks.managedcatalog.ManagedCatalogClientImpl.recordAndWrapException(ManagedCatalogClientImpl.scala:3670)&lt;BR /&gt;at com.databricks.managedcatalog.ManagedCatalogClientImpl.generateTemporaryPathCredentials(ManagedCatalogClientImpl.scala:3253)&lt;BR /&gt;at com.databricks.sql.managedcatalog.ManagedCatalogCommon.generateTemporaryPathCredentials(ManagedCatalogCommon.scala:1456)&lt;BR /&gt;at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.$anonfun$generateTemporaryPathCredentials$2(ProfiledManagedCatalog.scala:564)&lt;BR /&gt;at 
org.apache.spark.sql.catalyst.MetricKeyUtils$.measure(MetricKey.scala:319)&lt;BR /&gt;at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.$anonfun$profile$1(ProfiledManagedCatalog.scala:55)&lt;BR /&gt;at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)&lt;BR /&gt;at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.profile(ProfiledManagedCatalog.scala:54)&lt;BR /&gt;at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.generateTemporaryPathCredentials(ProfiledManagedCatalog.scala:564)&lt;BR /&gt;at com.databricks.unity.CredentialScopeSQLHelper$.checkPathOperations(CredentialScopeSQLHelper.scala:95)&lt;BR /&gt;at com.databricks.unity.CredentialScopeSQLHelper$.registerExternalLocationPath(CredentialScopeSQLHelper.scala:197)&lt;BR /&gt;at com.databricks.unity.CredentialScopeSQLHelper$.register(CredentialScopeSQLHelper.scala:154)&lt;BR /&gt;at com.databricks.unity.CredentialScopeSQLHelper$.registerPathAccess(CredentialScopeSQLHelper.scala:443)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.ExternalLocationHelper$.$anonfun$registerPaths$1(ExternalLocationHelper.scala:48)&lt;BR /&gt;at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)&lt;BR /&gt;at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)&lt;BR /&gt;at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.ExternalLocationHelper$.registerPaths(ExternalLocationHelper.scala:41)&lt;BR /&gt;at com.databricks.backend.daemon.dbutils.FSUtils.$anonfun$cp$3(DBUtilsCore.scala:334)&lt;BR /&gt;... 
40 more&lt;/PRE&gt;&lt;P&gt;That particular error relates to the user not having write permissions on the external location. I can add a grant to give the user WRITE FILES on it, but that just defers the issue to the next tier, resulting in the error:&lt;/P&gt;&lt;PRE&gt;Caused by: com.databricks.sql.managedcatalog.acl.UnauthorizedAccessException: PERMISSION_DENIED: User cannot write to a read-only external location auth_kafka&lt;/PRE&gt;&lt;P&gt;since the external location is flagged as read-only.&lt;/P&gt;&lt;P&gt;This is weird to me - why is write access needed to copy &lt;EM&gt;from&lt;/EM&gt; the abfss location? Surely it only needs to read it? I can confirm that opening up the permissions on the external location to allow writes resolves the issue...but that kinda defeats the purpose?&lt;/P&gt;</description>
      <pubDate>Wed, 20 Sep 2023 09:12:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dbutils-fs-cp-requires-write-permissions-on-the-source/m-p/45414#M27876</guid>
      <dc:creator>mwoods</dc:creator>
      <dc:date>2023-09-20T09:12:51Z</dc:date>
    </item>
    <item>
      <title>Re: dbutils.fs.cp requires write permissions on the source</title>
      <link>https://community.databricks.com/t5/data-engineering/dbutils-fs-cp-requires-write-permissions-on-the-source/m-p/45446#M27884</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;Sorry...not sure how that's relevant? Was that posted to the wrong topic?&lt;/P&gt;&lt;P&gt;This question is about what appears to be a bug in dbutils.fs, where the cp function appears to require write access to the source (as opposed to just read access). Write access should only be necessary on the destination.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Sep 2023 13:08:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dbutils-fs-cp-requires-write-permissions-on-the-source/m-p/45446#M27884</guid>
      <dc:creator>mwoods</dc:creator>
      <dc:date>2023-09-20T13:08:34Z</dc:date>
    </item>
  </channel>
</rss>

