<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using shared python wheels for job compute clusters in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23947#M16617</link>
    <description>&lt;P&gt;Yeah, it was an authentication issue. Turns out the normal compute clusters were set up with instance profiles, but the job clusters never were, so installing the wheel failed for jobs.&lt;/P&gt;&lt;P&gt;TL;DR: job clusters also need an instance profile attached to access shared resources such as the S3 bucket.&lt;/P&gt;</description>
    <pubDate>Sat, 02 Apr 2022 16:42:19 GMT</pubDate>
    <dc:creator>Mr__E</dc:creator>
    <dc:date>2022-04-02T16:42:19Z</dc:date>
    <item>
      <title>Using shared python wheels for job compute clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23943#M16613</link>
      <description>&lt;P&gt;We have a GitHub workflow that generates a Python wheel and uploads it to a shared S3 bucket available to our Databricks workspaces. When I install the wheel on a normal compute cluster using the path approach, it installs correctly and I can use the library. However, when I install it on a job compute cluster, I receive the following error:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;Run result unavailable: job failed with error message Library installation failed for library due to user error for whl: "s3://shared-python-packages/mywheel-0.0.latest-py3-none-any.whl" . Error messages: java.lang.RuntimeException: ManagedLibraryInstallFailed: java.util.concurrent.ExecutionException: java.nio.file.AccessDeniedException: s3a://shared-python-packages/mywheel-0.0.latest-py3-none-any.whl: getFileStatus on s3a://shared-python-packages/mywheel-0.0.latest-py3-none-any.whl: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden; request: HEAD &lt;A href="https://shared-python-packages.s3-us-west-2.amazonaws.com" target="test_blank"&gt;https://shared-python-packages.s3-us-west-2.amazonaws.com&lt;/A&gt; nanads-0.0.latest-py3-none-any.whl&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;How do I give the job clusters the correct access?&lt;/P&gt;</description>
      <pubDate>Sat, 02 Apr 2022 11:02:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23943#M16613</guid>
      <dc:creator>Mr__E</dc:creator>
      <dc:date>2022-04-02T11:02:36Z</dc:date>
    </item>
    <item>
      <title>Re: Using shared python wheels for job compute clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23944#M16614</link>
      <description>&lt;P&gt;You can mount the S3 bucket as a DBFS folder, then install the library from it via "cluster" -&amp;gt; "libraries" tab -&amp;gt; "install new" -&amp;gt; "DBFS".&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image.png"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1973iC8ABAFCF96FD82AB/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 02 Apr 2022 12:34:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23944#M16614</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-04-02T12:34:49Z</dc:date>
    </item>
    <item>
      <title>Re: Using shared python wheels for job compute clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23945#M16615</link>
      <description>&lt;P&gt;Thanks! This is what I'm already doing. It works fine for normal compute clusters, but it doesn't work for job clusters and gives the error mentioned above.&lt;/P&gt;</description>
      <pubDate>Sat, 02 Apr 2022 15:07:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23945#M16615</guid>
      <dc:creator>Mr__E</dc:creator>
      <dc:date>2022-04-02T15:07:58Z</dc:date>
    </item>
    <item>
      <title>Re: Using shared python wheels for job compute clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23946#M16616</link>
      <description>&lt;P&gt;Yes, but did you put "s3://shared-python-packages..." in the File path, rather than "/your_mount/shared-python-packages.." or an S3 path that includes the access token?&lt;/P&gt;&lt;P&gt;It looks like an authentication problem. A permanent mount could solve the issue:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;dbutils.fs.mount("s3a://%s:%s@%s" % (access_key, encoded_secret_key, aws_bucket_name), "/mnt/%s" % mount_name)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;More here: &lt;A href="https://docs.databricks.com/data/data-sources/aws/amazon-s3.html#mount-a-bucket-using-aws-keys" alt="https://docs.databricks.com/data/data-sources/aws/amazon-s3.html#mount-a-bucket-using-aws-keys" target="_blank"&gt;https://docs.databricks.com/data/data-sources/aws/...&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 02 Apr 2022 15:50:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23946#M16616</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-04-02T15:50:44Z</dc:date>
    </item>
    <item>
      <title>Re: Using shared python wheels for job compute clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23947#M16617</link>
      <description>&lt;P&gt;Yeah, it was an authentication issue. Turns out the normal compute clusters were set up with instance profiles, but the job clusters never were, so installing the wheel failed for jobs.&lt;/P&gt;&lt;P&gt;TL;DR: job clusters also need an instance profile attached to access shared resources such as the S3 bucket.&lt;/P&gt;</description>
      <pubDate>Sat, 02 Apr 2022 16:42:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23947#M16617</guid>
      <dc:creator>Mr__E</dc:creator>
      <dc:date>2022-04-02T16:42:19Z</dc:date>
    </item>
    <item>
      <title>Re: Using shared python wheels for job compute clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23948#M16618</link>
      <description>&lt;P&gt;@Erik Louie, we are glad that the issue is resolved. Could you please mark the best answer so that the thread can be closed and others can refer to it?&lt;/P&gt;</description>
      <pubDate>Tue, 05 Apr 2022 07:09:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-shared-python-wheels-for-job-compute-clusters/m-p/23948#M16618</guid>
      <dc:creator>Prabakar</dc:creator>
      <dc:date>2022-04-05T07:09:48Z</dc:date>
    </item>
  </channel>
</rss>

