<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Use Python code from a remote Git repository in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/use-python-code-from-a-remote-git-repository/m-p/45593#M27942</link>
    <description>&lt;P&gt;&lt;SPAN&gt;I'm trying to create a task where the source is a Python script located in a remote GitLab repo. I'm following the instructions &lt;A href="https://docs.databricks.com/en/workflows/jobs/how-to/use-repos.html#use-python-code-from-a-remote-git-repository" target="_self"&gt;HERE&lt;/A&gt; and this is how I have the task set up:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="03.png" style="width: 823px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/3856i49CB94974F27A09F/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999" role="button" title="03.png" alt="03.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;However, no matter what path I specify, all I get is the error below:&lt;/SPAN&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Cannot read the python file /Repos/.internal/4ac77871e5_commits/36c5bd9261b47a0bad5eaf99f32b9a2cd7032471/projects/membership-churn/mlops/workflow_notebooks_v1/02%20-%20Production%20Data%20ETL.py. Please check driver logs for more details&lt;/LI-CODE&gt;&lt;P&gt;I've searched the Community, found &lt;A href="https://community.databricks.com/t5/data-engineering/unable-to-run-python-script-from-git-repo-in-databricks-job/td-p/5157" target="_self"&gt;THIS&lt;/A&gt; thread, and tried to follow the advice given there.&lt;/P&gt;&lt;P&gt;The driver logs show the following &lt;STRONG&gt;AssertionError&lt;/STRONG&gt;:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;23/09/21 19:25:24 WARN JupyterDriverLocal: User code returned error with traceback: ---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
File ~/.ipykernel/1132/command--1-426573936:2
      1 from os.path import exists
----&amp;gt; 2 assert(exists("/Workspace/Repos/.internal/4ac77871e5_commits/36c5bd9261b47a0bad5eaf99f32b9a2cd7032471/projects/membership-churn/mlops/workflow_notebooks_v1/02%20-%20Production%20Data%20ETL.py"))&lt;/LI-CODE&gt;&lt;P&gt;I'm not quite sure why it is trying to invoke a Jupyter driver, since the files in my repo are all &lt;EM&gt;*.py&lt;/EM&gt;.&lt;/P&gt;&lt;P&gt;What am I doing wrong here?&lt;/P&gt;</description>
    <pubDate>Thu, 21 Sep 2023 20:38:28 GMT</pubDate>
    <dc:creator>ChingizK</dc:creator>
    <dc:date>2023-09-21T20:38:28Z</dc:date>
    <item>
      <title>Use Python code from a remote Git repository</title>
      <link>https://community.databricks.com/t5/data-engineering/use-python-code-from-a-remote-git-repository/m-p/45593#M27942</link>
      <description>&lt;P&gt;&lt;SPAN&gt;I'm trying to create a task where the source is a Python script located in a remote GitLab repo. I'm following the instructions &lt;A href="https://docs.databricks.com/en/workflows/jobs/how-to/use-repos.html#use-python-code-from-a-remote-git-repository" target="_self"&gt;HERE&lt;/A&gt; and this is how I have the task set up:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="03.png" style="width: 823px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/3856i49CB94974F27A09F/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999" role="button" title="03.png" alt="03.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;However, no matter what path I specify, all I get is the error below:&lt;/SPAN&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Cannot read the python file /Repos/.internal/4ac77871e5_commits/36c5bd9261b47a0bad5eaf99f32b9a2cd7032471/projects/membership-churn/mlops/workflow_notebooks_v1/02%20-%20Production%20Data%20ETL.py. Please check driver logs for more details&lt;/LI-CODE&gt;&lt;P&gt;I've searched the Community, found &lt;A href="https://community.databricks.com/t5/data-engineering/unable-to-run-python-script-from-git-repo-in-databricks-job/td-p/5157" target="_self"&gt;THIS&lt;/A&gt; thread, and tried to follow the advice given there.&lt;/P&gt;&lt;P&gt;The driver logs show the following &lt;STRONG&gt;AssertionError&lt;/STRONG&gt;:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;23/09/21 19:25:24 WARN JupyterDriverLocal: User code returned error with traceback: ---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
File ~/.ipykernel/1132/command--1-426573936:2
      1 from os.path import exists
----&amp;gt; 2 assert(exists("/Workspace/Repos/.internal/4ac77871e5_commits/36c5bd9261b47a0bad5eaf99f32b9a2cd7032471/projects/membership-churn/mlops/workflow_notebooks_v1/02%20-%20Production%20Data%20ETL.py"))&lt;/LI-CODE&gt;&lt;P&gt;I'm not quite sure why it is trying to invoke a Jupyter driver, since the files in my repo are all &lt;EM&gt;*.py&lt;/EM&gt;.&lt;/P&gt;&lt;P&gt;What am I doing wrong here?&lt;/P&gt;</description>
      <pubDate>Thu, 21 Sep 2023 20:38:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/use-python-code-from-a-remote-git-repository/m-p/45593#M27942</guid>
      <dc:creator>ChingizK</dc:creator>
      <dc:date>2023-09-21T20:38:28Z</dc:date>
    </item>
  </channel>
</rss>

