<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: UDF java can't access files in Unity Catalog - Operation not permitted in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/105034#M41973</link>
    <description>&lt;P&gt;It looks like this is related to Databricks security: file access is not allowed from the constructor, only from the call() method.&lt;/P&gt;&lt;LI-CODE lang="java"&gt;public class SomeUDF implements UDF2&amp;lt;String, String, String&amp;gt; {
    
     public SomeUDF(){
     
       try {

            List&amp;lt;String&amp;gt; lines = Files.readAllLines(Paths.get(path)); //&amp;lt;--operation not allowed
            for (String line : lines) {

                log.info("line from file: " + line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
     
     }


    @Override
    public String call(String path, String topic) throws Exception {

        try {

            List&amp;lt;String&amp;gt; lines = Files.readAllLines(Paths.get(path)); //works
            for (String line : lines) {

                log.info("line from file: " + line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        return "Hello";


    }
}&lt;/LI-CODE&gt;&lt;P&gt;Any suggestions? I must run init code on load that reads the content of a file in the catalog.&lt;/P&gt;</description>
    <pubDate>Thu, 09 Jan 2025 18:15:30 GMT</pubDate>
    <dc:creator>yevsh</dc:creator>
    <dc:date>2025-01-09T18:15:30Z</dc:date>
    <item>
      <title>UDF java can't access files in Unity Catalog - Operation not permitted</title>
      <link>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/104584#M41807</link>
      <description>&lt;P&gt;I am using Databricks on Azure.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;In PySpark I register a Java UDF:&lt;/P&gt;&lt;P&gt;spark.udf.registerJavaFunction("foo", "com.foo.Foo", T.StringType())&lt;BR /&gt;Foo tries to load a file located in Databricks Unity Catalog, using Files.readAllLines().&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;stderr log:&lt;/P&gt;&lt;P&gt;Tue Jan 7 12:36:33 2025 Connection to spark from PID 1908&lt;BR /&gt;Tue Jan 7 12:36:33 2025 Initialized gateway on port 33693&lt;BR /&gt;Tue Jan 7 12:36:34 2025 Connected to spark.&lt;BR /&gt;2025/01/07 12:36:39 WARNING mlflow.utils.autologging_utils: You are using an unsupported version of langchain. If you encounter errors during autologging, try upgrading / downgrading langchain to a supported version, or try upgrading MLflow.&lt;BR /&gt;2025/01/07 12:36:41 WARNING mlflow.utils.autologging_utils: You are using an unsupported version of openai. If you encounter errors during autologging, try upgrading / downgrading openai to a supported version, or try upgrading MLflow.&lt;BR /&gt;java.nio.file.FileSystemException: /Volumes/xxxx_volume/config/foo.yaml: Operation not permitted&lt;BR /&gt;at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)&lt;BR /&gt;at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)&lt;BR /&gt;at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)&lt;BR /&gt;at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:218)&lt;BR /&gt;at java.base/java.nio.file.Files.newByteChannel(Files.java:380)&lt;BR /&gt;at java.base/java.nio.file.Files.newByteChannel(Files.java:432)&lt;BR /&gt;at java.base/java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:422)&lt;BR /&gt;at java.base/java.nio.file.Files.newInputStream(Files.java:160)&lt;BR /&gt;at java.base/java.nio.file.Files.newBufferedReader(Files.java:2922)&lt;BR /&gt;at java.base/java.nio.file.Files.readAllLines(Files.java:3412)&lt;BR /&gt;at java.base/java.nio.file.Files.readAllLines(Files.java:3453)&lt;BR /&gt;at XXXXXXXXXXXXXXXXXXXXXXXXXXXXx&lt;BR /&gt;at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)&lt;BR /&gt;at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)&lt;BR /&gt;at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)&lt;BR /&gt;at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)&lt;BR /&gt;at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)&lt;BR /&gt;at org.apache.spark.sql.UDFRegistration.registerJava(UDFRegistration.scala:696)&lt;BR /&gt;at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;BR /&gt;at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)&lt;BR /&gt;at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;BR /&gt;at java.base/java.lang.reflect.Method.invoke(Method.java:569)&lt;BR /&gt;at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)&lt;BR /&gt;at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)&lt;BR /&gt;at py4j.Gateway.invoke(Gateway.java:306)&lt;BR /&gt;at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)&lt;BR /&gt;at py4j.commands.CallCommand.execute(CallCommand.java:79)&lt;BR /&gt;at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)&lt;BR /&gt;at py4j.ClientServerConnection.run(ClientServerConnection.java:119)&lt;BR /&gt;at java.base/java.lang.Thread.run(Thread.java:840)&lt;/P&gt;
&lt;P&gt;Python itself can access the file; the issue seems to occur only when Java accesses it.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;In the Databricks terminal I see the file as:&lt;/P&gt;&lt;P&gt;-rwxrwxrwx 1 nobody nogroup&amp;nbsp; &amp;nbsp;512 Jan&amp;nbsp; 7 07:57 foo.yaml&lt;BR /&gt;The cluster was created from the compute pool.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2025 18:36:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/104584#M41807</guid>
      <dc:creator>yevsh</dc:creator>
      <dc:date>2025-01-07T18:36:30Z</dc:date>
    </item>
    <item>
      <title>Re: UDF java can't access files in Unity Catalog - Operation not permitted</title>
      <link>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/104587#M41810</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;STRONG&gt;Check File Permissions&lt;/STRONG&gt;: Ensure that the file &lt;CODE&gt;foo.yaml&lt;/CODE&gt; has the correct permissions set for the user running the Java process. The file should be accessible by the user under which the Java process is running. You can check the file permissions using the &lt;CODE&gt;ls -l&lt;/CODE&gt; command in the terminal.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;STRONG&gt;Check Mount Options&lt;/STRONG&gt;: Verify that the volume &lt;CODE&gt;/Volumes/xxxx_volume&lt;/CODE&gt; is mounted with the correct options that allow Java to access the files. Sometimes, volumes mounted with certain options might restrict access to specific users or processes.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;STRONG&gt;Run Java Process with Elevated Privileges&lt;/STRONG&gt;: If possible, try running the Java process with elevated privileges (e.g., using &lt;CODE&gt;sudo&lt;/CODE&gt;) to see if it resolves the permission issue. However, this should be done with caution and only if it is safe and appropriate for your environment.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2025 18:54:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/104587#M41810</guid>
      <dc:creator>Walter_C</dc:creator>
      <dc:date>2025-01-07T18:54:02Z</dc:date>
    </item>
    <item>
      <title>Re: UDF java can't access files in Unity Catalog - Operation not permitted</title>
      <link>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/104647#M41827</link>
      <description>&lt;P&gt;In the post I listed the current file permissions.&lt;BR /&gt;&lt;BR /&gt;There are no explicit mounts. What exactly should be defined, and where?&lt;BR /&gt;&lt;BR /&gt;I am not running a Java process explicitly. As stated in the post, I only call&amp;nbsp;&lt;BR /&gt;&lt;SPAN&gt;spark.udf.registerJavaFunction("foo", "com.foo.Foo", T.StringType())&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jan 2025 06:29:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/104647#M41827</guid>
      <dc:creator>yevsh</dc:creator>
      <dc:date>2025-01-08T06:29:53Z</dc:date>
    </item>
    <item>
      <title>Re: UDF java can't access files in Unity Catalog - Operation not permitted</title>
      <link>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/105034#M41973</link>
      <description>&lt;P&gt;It looks like this is related to Databricks security: file access is not allowed from the constructor, only from the call() method.&lt;/P&gt;&lt;LI-CODE lang="java"&gt;public class SomeUDF implements UDF2&amp;lt;String, String, String&amp;gt; {
    
     public SomeUDF(){
     
       try {

            List&amp;lt;String&amp;gt; lines = Files.readAllLines(Paths.get(path)); //&amp;lt;--operation not allowed
            for (String line : lines) {

                log.info("line from file: " + line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
     
     }


    @Override
    public String call(String path, String topic) throws Exception {

        try {

            List&amp;lt;String&amp;gt; lines = Files.readAllLines(Paths.get(path)); //works
            for (String line : lines) {

                log.info("line from file: " + line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        return "Hello";


    }
}&lt;/LI-CODE&gt;&lt;P&gt;Any suggestions? I must run init code on load that reads the content of a file in the catalog.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jan 2025 18:15:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/105034#M41973</guid>
      <dc:creator>yevsh</dc:creator>
      <dc:date>2025-01-09T18:15:30Z</dc:date>
    </item>
    <item>
      <title>Re: UDF java can't access files in Unity Catalog - Operation not permitted</title>
      <link>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/105435#M42123</link>
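      <description>&lt;P&gt;To run initialization code that reads file content when a UDF (User Defined Function) is loaded in Databricks, avoid performing file operations in the constructor, since they appear to be blocked there by security restrictions. Instead, you can use a static block (which runs once when the class is loaded) or a singleton pattern so that the initialization code runs only once. Note that a static block still runs at class-initialization time, so a lazily initialized singleton that is populated on the first call() invocation may be the safer choice, given that file access was observed to work inside call().&lt;/P&gt;</description>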
      <pubDate>Mon, 13 Jan 2025 14:40:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/udf-java-can-t-access-files-in-unity-catalog-operation-not/m-p/105435#M42123</guid>
      <dc:creator>Walter_C</dc:creator>
      <dc:date>2025-01-13T14:40:14Z</dc:date>
    </item>
  </channel>
</rss>

