Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Import a notebook in a Release Pipeline with a Python script

RantoB
Valued Contributor

Hi,

I would like to import a Python file into Databricks with an Azure DevOps release pipeline.

Within the pipeline I execute a Python script which contains this code:

import sys
import os
import base64
import requests
 
dbw_url = sys.argv[1] # https://adb-XXXXXXXXXXXXX.XX.azuredatabricks.net/
token = sys.argv[2] # databricks PAT
root_source = os.path.join(os.environ.get('SYSTEM_DEFAULTWORKINGDIRECTORY'), '_Build Notebook Artifact', 'artifact_dir_path') # This is a result from a build pipeline
target_dir_path = '/Shared'
file = os.listdir(root_source)[0]
print(file)
 
with open(os.path.join(root_source, file), 'rb') as f:
    data = base64.standard_b64encode(f.read()).decode('utf-8')
 
# Named "payload" to avoid shadowing the standard-library json module name
payload = {
    "content": data,
    "path": os.path.join(target_dir_path, file),
    "language": "PYTHON",
    "overwrite": True,
    "format": "SOURCE"
}
 
import_notebook = requests.post(
    '{}/api/2.0/workspace/import'.format(dbw_url),
    headers={'Authorization': 'Bearer {}'.format(token)},
    json=payload
)
 
print(import_notebook.status_code) # -> 200
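One detail worth checking when a request returns 200 yet nothing appears in the workspace: the comment above shows dbw_url being passed with a trailing slash, so the format() call produces a double slash in the request URL. A minimal sketch of the difference, using a made-up workspace URL:

```python
# Hypothetical base URL passed with a trailing slash, as in the argv comment above
dbw_url = "https://adb-1234567890123456.7.azuredatabricks.net/"

naive = "{}/api/2.0/workspace/import".format(dbw_url)
print(naive)  # note the double slash before "api"

# Stripping the trailing slash first keeps the path clean
clean = "{}/api/2.0/workspace/import".format(dbw_url.rstrip("/"))
print(clean)
```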

The status code is 200, but nothing has been imported into my Databricks workspace.

Here is what I have in my pipeline logs:

2021-11-15T14:39:54.0421229Z ##[section]Starting: Run a Python script
2021-11-15T14:39:54.0429015Z ==============================================================================
2021-11-15T14:39:54.0429348Z Task         : Python script
2021-11-15T14:39:54.0429590Z Description  : Run a Python file or inline script
2021-11-15T14:39:54.0429815Z Version      : 0.182.0
2021-11-15T14:39:54.0430025Z Author       : Microsoft Corporation
2021-11-15T14:39:54.0430358Z Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/python-script
2021-11-15T14:39:54.0430694Z ==============================================================================
2021-11-15T14:39:54.1829847Z [command]/opt/hostedtoolcache/Python/3.10.0/x64/bin/python /home/vsts/work/_temp/2dfc3151-6ce7-4c6d-a74e-59125c767241.py *** ***
2021-11-15T14:39:54.4797958Z ingest_csv.py
2021-11-15T14:39:54.4798454Z 200
2021-11-15T14:39:54.5018448Z ##[section]Finishing: Run a Python script

This is working fine when I execute it from my local machine.

Note: strangely, when I put a ***** value in the 'token' variable I get the same result: status code 200. From my local machine, however, I get an error as expected.
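A 200 even with a bogus token suggests the request may never have reached the REST API (for instance, a silent redirect to an HTML login page also returns 200). One way to make that visible is to check whether the response body is actually JSON. This helper is a sketch of my own, not part of any Databricks SDK:

```python
import json

def looks_like_api_response(status_code, content_type, body):
    """Hypothetical sanity check: a real Databricks REST response is JSON,
    whereas a redirect to a login or error page typically returns HTML
    with status 200."""
    if status_code != 200 or "html" in content_type.lower():
        return False
    try:
        json.loads(body)
        return True
    except ValueError:
        return False

print(looks_like_api_response(200, "application/json", "{}"))           # True
print(looks_like_api_response(200, "text/html", "<html>login</html>"))  # False
```

Calling it with `import_notebook.status_code`, `import_notebook.headers.get("Content-Type", "")`, and `import_notebook.text` would distinguish a genuine API reply from a page that merely returned 200.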

Thanks for your help.

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

Recently I wrote about an alternative way to export/import notebooks in Python: https://community.databricks.com/s/question/0D53f00001TgT52CAF/import-notebook-with-python-script-us... This way you will get a more readable error message (it is often related to the host name or access rights).

    # Install the legacy CLI first: pip install databricks-cli
    from databricks_cli.workspace.api import WorkspaceApi
    from databricks_cli.sdk.api_client import ApiClient
     
    client = ApiClient(
        host='https://your.databricks-url.net',
        token=api_key  # your Databricks PAT
    )
    workspace_api = WorkspaceApi(client)
    workspace_api.import_workspace(
        source_path="/your/dir/here/hello.py",
        target_path="/Repos/test/hello.py",
        language="PYTHON",
        fmt="SOURCE",
        is_overwrite=True
    )
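Whichever client is used, the workspace import API carries the notebook body as base64-encoded text; the CLI wrapper just hides that step. The encoding round trip from the requests-based script can be checked in isolation (the notebook source here is made up for illustration):

```python
import base64

# Hypothetical notebook source to upload
source = "print('hello from ingest_csv')\n"

# The import endpoint expects SOURCE content base64-encoded,
# exactly as in the requests-based script above
encoded = base64.standard_b64encode(source.encode("utf-8")).decode("utf-8")

# Round trip: decoding restores the original file content
decoded = base64.standard_b64decode(encoded).decode("utf-8")
print(decoded == source)  # True
```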


2 REPLIES 2


RantoB
Valued Contributor

Okay, I will use the dedicated Python API.

Thanks
