Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Use Python code from a remote Git repository

ChingizK
New Contributor III

I'm trying to create a task whose source is a Python script located in a remote GitLab repo. I'm following the instructions HERE, and this is how I have the task set up:

[Screenshot of the task setup: 03.png]

However, no matter what path I specify, all I get is the error below:

Cannot read the python file /Repos/.internal/4ac77871e5_commits/36c5bd9261b47a0bad5eaf99f32b9a2cd7032471/projects/membership-churn/mlops/workflow_notebooks_v1/02%20-%20Production%20Data%20ETL.py. Please check driver logs for more details

I've searched the Community, found THIS, and tried to follow the advice mentioned there.

The logs show the following AssertionError:

23/09/21 19:25:24 WARN JupyterDriverLocal: User code returned error with traceback: ---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
File ~/.ipykernel/1132/command--1-426573936:2
      1 from os.path import exists
----> 2 assert(exists("/Workspace/Repos/.internal/4ac77871e5_commits/36c5bd9261b47a0bad5eaf99f32b9a2cd7032471/projects/membership-churn/mlops/workflow_notebooks_v1/02%20-%20Production%20Data%20ETL.py"))

I'm not quite sure why it is trying to invoke a Jupyter driver, since the files in my repo are all *.py.

What am I doing wrong here?
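
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the path in the assertion above contains `%20` where the file name presumably has spaces. `os.path.exists` does not decode percent-encoding, so an encoded name and the name on disk are different strings. The sketch below reproduces that behavior locally in a temporary directory. The helper name `check_encoded_vs_decoded` is hypothetical, and the idea that the real file name contains spaces is inferred from the `%20` sequences, not stated in the post.

```python
import os
import tempfile
from urllib.parse import unquote


def check_encoded_vs_decoded(directory, encoded_name):
    """Return (exists_as_encoded, exists_as_decoded) for a name inside directory.

    Hypothetical helper for illustration only; not part of any Databricks API.
    """
    decoded_name = unquote(encoded_name)  # turns "%20" back into a space
    return (
        os.path.exists(os.path.join(directory, encoded_name)),
        os.path.exists(os.path.join(directory, decoded_name)),
    )


with tempfile.TemporaryDirectory() as tmp:
    # Assumption: the file in the repo is named with literal spaces.
    real_name = "02 - Production Data ETL.py"
    open(os.path.join(tmp, real_name), "w").close()

    # The name as it appears in the error message, with %20 sequences.
    result = check_encoded_vs_decoded(tmp, "02%20-%20Production%20Data%20ETL.py")
    print(result)  # (False, True)
```

If the same mismatch applies to the job, the existence check would fail regardless of what is typed in the Path field. Whether that is what happens here is not confirmed by this thread.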

1 REPLY

Kaniz_Fatma
Community Manager

Hi @ChingizK, the issue you are experiencing might be caused by starting your path with a /. According to the provided information, the relative path you enter should not begin with / or ./.

For example, if the absolute path for the Python code you want to access is /python/covid_eda_raw.py, you should enter python/covid_eda_raw.py in the Path field.

So, in your case, instead of /Repos/.internal/4ac77871e5_commits/36c5bd9261b47a0bad5eaf99f32b9a2cd7032471/projects/membership-churn/mlops/workflow_notebooks_v1/02%20-%20Production%20Data%20ETL.py, you should use Repos/.internal/4ac77871e5_commits/36c5bd9261b47a0bad5eaf99f32b9a2cd7032471/projects/membership-churn/mlops/workflow_notebooks_v1/02%20-%20Production%20Data%20ETL.py.

Please adjust your path accordingly and try again.
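
The rule described in this reply (no leading / or ./ in the Path field) can be sketched as a small normalizing function. This is a minimal illustration in plain Python, not a Databricks API; the function name `to_repo_relative` is hypothetical, and the example paths are the ones quoted above.

```python
def to_repo_relative(path: str) -> str:
    """Strip leading "/" and "./" so the path is relative to the repository root.

    Hypothetical helper for illustration only.
    """
    cleaned = path.strip()
    # Remove any mix of leading "./" and "/" prefixes, one at a time.
    while True:
        if cleaned.startswith("./"):
            cleaned = cleaned[2:]
        elif cleaned.startswith("/"):
            cleaned = cleaned[1:]
        else:
            break
    return cleaned


print(to_repo_relative("/python/covid_eda_raw.py"))   # python/covid_eda_raw.py
print(to_repo_relative("./python/covid_eda_raw.py"))  # python/covid_eda_raw.py
```

A path that is already relative, such as python/covid_eda_raw.py, passes through unchanged.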
