10-27-2022 07:15 AM
After reading the documentation, we tested whether jobs in Azure Databricks Workflows can run source code stored in an Azure DevOps Services repository.
We followed the instructions in the documentation and filled in the Git information menu with the details of our repository.
To configure the Git information menu (Workflows > Jobs > "test_azure_repos" [the name of the job created for the test] > Tasks > Source > Git Information), we copied the repository URL from the clone-over-HTTPS option, as shown in Figure 1, and entered it in the "Git repository URL" field, as shown in Figure 2. We also entered the branch name and the Git provider type.
Figure 1. URL address used to clone the repository over HTTPS
Figure 2. Inserting the value from Figure 1 in the Git information menu in the Databricks workspace.
In the Databricks workspace git provider configuration (User settings > Git integration), the configuration option chosen was "Azure DevOps Services (Azure Active Directory)", as shown in Figure 3.
Figure 3. Git provider configuration.
However, when executing the job, an error is returned indicating that the notebook was not found in the repository, as shown in Figure 4.
Figure 4. Error "Notebook not found".
As the documentation indicates, we entered the path relative to the repository root and omitted the file extension.
The file path entered in the "Path" parameter of the task configuration was "src/test" (Figure 5), which matches the structure of the Azure DevOps Services repository, as shown in Figure 6.
Figure 5. File path configuration
Figure 6. Repository structure.
In the job run, we can verify that the commit ID matches the one in the remote repository. Even so, the execution fails with this "Notebook not found" error and reports an internal error, as shown in Figure 7.
We ran the same tests with GitHub and the executions were successful.
Figure 7. Job run with error
Could you please help me?
10-27-2022 12:34 PM
Can you check whether the file test.py is a valid Databricks notebook, i.e. that its first line is:
# Databricks notebook source
This error occurs if you try to run a plain Python script instead of a Databricks notebook.
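If it helps, here is a minimal sketch for checking this locally before pushing. It only compares the first line of the file with the header above; the function name `is_databricks_notebook` is just a hypothetical helper for illustration, not part of any Databricks API.

```python
# Header line that marks a .py file as a Databricks notebook (quoted from the reply above).
NOTEBOOK_HEADER = "# Databricks notebook source"


def is_databricks_notebook(path):
    """Return True if the first line of the file is the notebook header.

    Hypothetical helper: it only inspects the first line and does not
    validate the rest of the file.
    """
    with open(path, encoding="utf-8") as f:
        first_line = f.readline()
    # Ignore the trailing newline and surrounding whitespace when comparing.
    return first_line.strip() == NOTEBOOK_HEADER
```

For example, calling `is_databricks_notebook("src/test.py")` from the root of a local clone shows whether the file would be treated as a notebook or as a plain Python script.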
10-27-2022 12:53 PM
Oh, perfect Vaibhav! That was exactly the problem. I added the above content and it worked perfectly.
Thank you very much.
01-16-2023 12:15 AM
I have the same problem when integrating with GitHub repos. However, adding '# Databricks notebook source' at the top of the Python files did not solve it for me. Do you have any additional suggestions for solving this problem? @Vaibhav Sethi