<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Running Libraries and/or modules in Databricks' lifecycle? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/running-libraries-and-or-modules-in-databricks-lifecycle/m-p/3420#M465</link>
    <description>&lt;P&gt;Hi, I have had this question for some weeks and haven't found any information on the topic. Specifically, my question is: what is the 'lifecycle', that is, the steps needed to use a new Python library in Databricks, in terms of compatibility? For example, if I wanted to use Numba or Cython in Databricks, is that possible? And what about libraries that provide their own parallelism, such as running Dask on top of a cluster framework that already supports distributed computation? Is this possible, and if so, how does it work?&lt;/P&gt;&lt;P&gt;I'm not sure if I'm making myself understood &lt;span class="lia-unicode-emoji" title=":grinning_face_with_sweat:"&gt;😅&lt;/span&gt;&lt;/P&gt;&lt;P&gt;If someone knows, could you share any resources to delve into the topic? Thank you very much, friends!&lt;/P&gt;</description>
    <pubDate>Thu, 08 Jun 2023 00:12:31 GMT</pubDate>
    <dc:creator>carlosst01</dc:creator>
    <dc:date>2023-06-08T00:12:31Z</dc:date>
    <item>
      <title>Running Libraries and/or modules in Databricks' lifecycle?</title>
      <link>https://community.databricks.com/t5/data-engineering/running-libraries-and-or-modules-in-databricks-lifecycle/m-p/3420#M465</link>
      <description>&lt;P&gt;Hi, I have had this question for some weeks and haven't found any information on the topic. Specifically, my question is: what is the 'lifecycle', that is, the steps needed to use a new Python library in Databricks, in terms of compatibility? For example, if I wanted to use Numba or Cython in Databricks, is that possible? And what about libraries that provide their own parallelism, such as running Dask on top of a cluster framework that already supports distributed computation? Is this possible, and if so, how does it work?&lt;/P&gt;&lt;P&gt;I'm not sure if I'm making myself understood &lt;span class="lia-unicode-emoji" title=":grinning_face_with_sweat:"&gt;😅&lt;/span&gt;&lt;/P&gt;&lt;P&gt;If someone knows, could you share any resources to delve into the topic? Thank you very much, friends!&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jun 2023 00:12:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/running-libraries-and-or-modules-in-databricks-lifecycle/m-p/3420#M465</guid>
      <dc:creator>carlosst01</dc:creator>
      <dc:date>2023-06-08T00:12:31Z</dc:date>
    </item>
    <item>
      <title>Re: Running Libraries and/or modules in Databricks' lifecycle?</title>
      <link>https://community.databricks.com/t5/data-engineering/running-libraries-and-or-modules-in-databricks-lifecycle/m-p/3421#M466</link>
      <description>&lt;P&gt;Basically, it is possible.&lt;/P&gt;&lt;P&gt;In essence, Databricks delivers Linux-based virtual machines with Spark installed.&lt;/P&gt;&lt;P&gt;If you want to run other software on those machines, it is probably possible.&lt;/P&gt;&lt;P&gt;&lt;A href="https://medium.com/behindthewires/dask-on-azure-databricks-37b5a1537595" alt="https://medium.com/behindthewires/dask-on-azure-databricks-37b5a1537595" target="_blank"&gt;Here&lt;/A&gt;, for example, someone installed Dask on Databricks. And &lt;A href="https://www.databricks.com/blog/2023/02/28/announcing-ray-support-databricks-and-apache-spark-clusters.html" alt="https://www.databricks.com/blog/2023/02/28/announcing-ray-support-databricks-and-apache-spark-clusters.html" target="_blank"&gt;here is Ray on Databricks&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;But as you can see, Databricks is not a general-purpose distributed compute platform. It is very Spark-oriented. So your attempts may or may not succeed, depending on the library.&lt;/P&gt;&lt;P&gt;Cython, for example, will probably work, but my guess is that it will only run on the driver.&lt;/P&gt;&lt;P&gt;So it might be easier to set up a Cython VM yourself.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 07:35:01 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/running-libraries-and-or-modules-in-databricks-lifecycle/m-p/3421#M466</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2023-06-12T07:35:01Z</dc:date>
    </item>
    <item>
      <title>Re: Running Libraries and/or modules in Databricks' lifecycle?</title>
      <link>https://community.databricks.com/t5/data-engineering/running-libraries-and-or-modules-in-databricks-lifecycle/m-p/3422#M467</link>
      <description>&lt;P&gt;Hi @Carlos Caravantes&lt;/P&gt;&lt;P&gt;Thank you for posting your question in our community! We are happy to assist you.&lt;/P&gt;&lt;P&gt;To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?&lt;/P&gt;&lt;P&gt;This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance!&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jun 2023 07:18:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/running-libraries-and-or-modules-in-databricks-lifecycle/m-p/3422#M467</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-06-13T07:18:26Z</dc:date>
    </item>
  </channel>
</rss>

