Import a notebook in a Release Pipeline with a Python script

RantoB
Valued Contributor

Hi,

I would like to import a Python file into Databricks with an Azure DevOps Release Pipeline.

Within the pipeline I execute a Python script which contains this code:

import sys
import os
import base64
import requests
 
dbw_url = sys.argv[1] # https://adb-XXXXXXXXXXXXX.XX.azuredatabricks.net/
token = sys.argv[2] # databricks PAT
root_source = os.path.join(os.environ.get('SYSTEM_DEFAULTWORKINGDIRECTORY'), '_Build Notebook Artifact', 'artifact_dir_path') # This is a result from a build pipeline
target_dir_path = '/Shared'
file = os.listdir(root_source)[0]
print(file)
 
with open(os.path.join(root_source, file), 'rb') as f:
    data = base64.standard_b64encode(f.read()).decode('utf-8')
 
json = {
    "content": data,
    "path": os.path.join(target_dir_path, file),
    "language": "PYTHON",
    "overwrite": True,
    "format": "SOURCE"      
}
 
import_notebook = requests.post(
  '{}/api/2.0/workspace/import'.format(dbw_url),
  headers={'Authorization': 'Bearer {}'.format(token)},
  json=json
)
 
print(import_notebook.status_code) # -> 200

The status code is 200, but nothing has been imported into my Databricks workspace.

Here is what I have in my pipeline logs:

2021-11-15T14:39:54.0421229Z ##[section]Starting: Run a Python script
2021-11-15T14:39:54.0429015Z ==============================================================================
2021-11-15T14:39:54.0429348Z Task         : Python script
2021-11-15T14:39:54.0429590Z Description  : Run a Python file or inline script
2021-11-15T14:39:54.0429815Z Version      : 0.182.0
2021-11-15T14:39:54.0430025Z Author       : Microsoft Corporation
2021-11-15T14:39:54.0430358Z Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/python-script
2021-11-15T14:39:54.0430694Z ==============================================================================
2021-11-15T14:39:54.1829847Z [command]/opt/hostedtoolcache/Python/3.10.0/x64/bin/python /home/vsts/work/_temp/2dfc3151-6ce7-4c6d-a74e-59125c767241.py *** ***
2021-11-15T14:39:54.4797958Z ingest_csv.py
2021-11-15T14:39:54.4798454Z 200
2021-11-15T14:39:54.5018448Z ##[section]Finishing: Run a Python script

This is working fine when I execute it from my local machine.

Note: it is weird, but when I put a ***** value in the 'token' variable, I get the same result: status code 200. From my local machine, however, I get an error as expected.
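
For reference, here is a minimal diagnostic sketch (reusing the import_notebook response and the dbw_url and token variables from the script above; purely illustrative, not something I have in the pipeline yet) that prints the response body and lists the target directory:

# The import endpoint reports error details in the response body,
# so print it alongside the status code.
print(import_notebook.status_code, import_notebook.text)
 
# List /Shared to check whether the notebook actually arrived.
check = requests.get(
    '{}/api/2.0/workspace/list'.format(dbw_url),
    headers={'Authorization': 'Bearer {}'.format(token)},
    params={'path': '/Shared'}
)
print(check.status_code, check.text)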

Thanks for your help.

1 ACCEPTED SOLUTION

Hubert-Dudek
Esteemed Contributor III

Recently I wrote about an alternative way to export/import notebooks in Python: https://community.databricks.com/s/question/0D53f00001TgT52CAF/import-notebook-with-python-script-us... This way you will get a more readable error message (it is often related to the host name or access rights).

    # pip install databricks-cli  (run as a shell command before this script)
    from databricks_cli.workspace.api import WorkspaceApi
    from databricks_cli.sdk.api_client import ApiClient
     
    client = ApiClient(
        host='https://your.databricks-url.net',  # your workspace URL
        token=api_key                            # your Databricks PAT
    )
    workspace_api = WorkspaceApi(client)
    # Note: depending on the databricks-cli version, import_workspace may also
    # expect language/format arguments and a differently named overwrite flag.
    workspace_api.import_workspace(
        source_path="/your/dir/here/hello.py",
        target_path="/Repos/test/hello.py",
        overwrite=True
    )
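
For completeness, here is a rough sketch of how failures surface with this approach (catching requests.exceptions.HTTPError is my assumption about how the databricks-cli ApiClient reports non-2xx responses; treat it as illustrative rather than definitive):

    import requests

    try:
        workspace_api.import_workspace(
            source_path="/your/dir/here/hello.py",
            target_path="/Repos/test/hello.py",
            overwrite=True
        )
    except requests.exceptions.HTTPError as e:
        # The exception text includes the server's error message
        # (e.g. invalid token, bad path, missing permissions).
        print("Import failed:", e)
        raise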


2 REPLIES


RantoB
Valued Contributor

Okay, I will use the dedicated Python API.

Thanks
