<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Deploying Data Source API code in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/deploying-data-source-api-code/m-p/105192#M42034</link>
    <description>&lt;P&gt;This might be a stupid question, but there's just no mention of what to do here.&amp;nbsp; I'm looking at the blog (&lt;A href="https://www.databricks.com/blog/simplify-data-ingestion-new-python-data-source-api" target="_blank"&gt;https://www.databricks.com/blog/simplify-data-ingestion-new-python-data-source-api&lt;/A&gt;) and documentation (&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/pyspark/datasources" target="_blank"&gt;https://learn.microsoft.com/en-us/azure/databricks/pyspark/datasources&lt;/A&gt;) for the Python Data Source API, and I don't see how to deploy the custom library.&amp;nbsp; Do we need to create a wheel file and upload it?&amp;nbsp; Do we use regular .py files in our workspace and %run them?&amp;nbsp; Any guidance would be appreciated.&lt;/P&gt;</description>
    <pubDate>Fri, 10 Jan 2025 14:33:03 GMT</pubDate>
    <dc:creator>Rjdudley</dc:creator>
    <dc:date>2025-01-10T14:33:03Z</dc:date>
    <item>
      <title>Deploying Data Source API code</title>
      <link>https://community.databricks.com/t5/data-engineering/deploying-data-source-api-code/m-p/105192#M42034</link>
      <description>&lt;P&gt;This might be a stupid question, but there's just no mention of what to do here.&amp;nbsp; I'm looking at the blog (&lt;A href="https://www.databricks.com/blog/simplify-data-ingestion-new-python-data-source-api" target="_blank"&gt;https://www.databricks.com/blog/simplify-data-ingestion-new-python-data-source-api&lt;/A&gt;) and documentation (&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/pyspark/datasources" target="_blank"&gt;https://learn.microsoft.com/en-us/azure/databricks/pyspark/datasources&lt;/A&gt;) for the Python Data Source API, and I don't see how to deploy the custom library.&amp;nbsp; Do we need to create a wheel file and upload it?&amp;nbsp; Do we use regular .py files in our workspace and %run them?&amp;nbsp; Any guidance would be appreciated.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Jan 2025 14:33:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/deploying-data-source-api-code/m-p/105192#M42034</guid>
      <dc:creator>Rjdudley</dc:creator>
      <dc:date>2025-01-10T14:33:03Z</dc:date>
    </item>
    <item>
      <title>Re: Deploying Data Source API code</title>
      <link>https://community.databricks.com/t5/data-engineering/deploying-data-source-api-code/m-p/105195#M42035</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/107723"&gt;@Rjdudley&lt;/a&gt;,&lt;/P&gt;
&lt;P class="p1"&gt;Thanks for your question - You can create regular .py files in your workspace and use the %run magic command to include them in your notebooks. This method is straightforward and good for development and testing.&lt;/P&gt;
&lt;P class="p1"&gt;%run /path/to/your/custom_datasource_file&lt;/P&gt;
&lt;P class="p1"&gt;For a more production-ready approach, you can create a wheel file of your custom data source implementation and upload it to your cluster or workspace. This method is preferred for sharing across multiple notebooks or jobs&lt;/P&gt;
&lt;P class="p1"&gt;•Package your code into a wheel file&lt;/P&gt;
&lt;P class="p1"&gt;•Upload the wheel file to your Databricks workspace or a accessible location (e.g., DBFS)&lt;/P&gt;
&lt;P class="p1"&gt;•Install the wheel file on your cluster using init scripts or pip install commands&lt;/P&gt;
&lt;P class="p1"&gt;You can also package your custom data source as a library and install it directly on your cluster&lt;/P&gt;</description>
      <pubDate>Fri, 10 Jan 2025 14:51:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/deploying-data-source-api-code/m-p/105195#M42035</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2025-01-10T14:51:56Z</dc:date>
    </item>
    <item>
      <title>Re: Deploying Data Source API code</title>
      <link>https://community.databricks.com/t5/data-engineering/deploying-data-source-api-code/m-p/105242#M42057</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/106294"&gt;@Alberto_Umana&lt;/a&gt;&amp;nbsp;Brilliant, thank you!&lt;/P&gt;</description>
      <pubDate>Fri, 10 Jan 2025 18:46:35 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/deploying-data-source-api-code/m-p/105242#M42057</guid>
      <dc:creator>Rjdudley</dc:creator>
      <dc:date>2025-01-10T18:46:35Z</dc:date>
    </item>
    <item>
      <title>Re: Deploying Data Source API code</title>
      <link>https://community.databricks.com/t5/data-engineering/deploying-data-source-api-code/m-p/105245#M42059</link>
      <description>&lt;P&gt;You're very welcome!&lt;/P&gt;</description>
      <pubDate>Fri, 10 Jan 2025 18:52:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/deploying-data-source-api-code/m-p/105245#M42059</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2025-01-10T18:52:57Z</dc:date>
    </item>
  </channel>
</rss>

