
Import notebook with python script using API

RantoB
Valued Contributor

Hi,

I would like to import a Python notebook into my Databricks workspace from my local machine using a Python script.

I managed to create the folder, but I get a status code 400 when I try to import a file:

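import base64
import requests

# DBW_URL (workspace URL) and TOKEN (personal access token) are defined elsewhere
create_folder = requests.post(
  '{}/api/2.0/workspace/mkdirs'.format(DBW_URL),
  headers={'Authorization': 'Bearer {}'.format(TOKEN)},
  json={"path": "/Repos/test"}
)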
 
print(create_folder.status_code) # -> 200
 
python_code = """
# Databricks notebook source
print("This notebook has been imported via API.")
"""
 
data = base64.standard_b64encode(python_code.encode('utf-8')).decode('utf-8')
 
import_notebook = requests.post(
  '{}/api/2.0/workspace/import'.format(DBW_URL),
  headers={'Authorization': 'Bearer {}'.format(TOKEN)},
  json={
      "content": data,
      "path": "/Repos/test/hello.py",
      "language": "PYTHON",
      "overwrite": True,
      "format": "SOURCE"      
  }
)
 
print(import_notebook.status_code) # -> 400

I am not sure about the way I encoded the "content" value, but I don't think this is the problem.

Thanks for your help.
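
For what it's worth, the encoding step in the question can be checked locally with a round trip, independently of the API call. The following is a minimal sketch; the helper name `encode_content` is hypothetical and only wraps the same `base64` calls used in the post.

```python
import base64

# Same notebook source as in the question.
python_code = (
    "# Databricks notebook source\n"
    'print("This notebook has been imported via API.")\n'
)


def encode_content(source: str) -> str:
    """Base64-encode notebook source the same way as in the question."""
    return base64.standard_b64encode(source.encode("utf-8")).decode("utf-8")


encoded = encode_content(python_code)

# Decoding must give back exactly the original source.
decoded = base64.standard_b64decode(encoded).decode("utf-8")
assert decoded == python_code
```

If the round trip succeeds, the "content" value is well-formed base64 text and the cause of the 400 is more likely to be elsewhere in the request.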

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

You can make your life easier by using the Python API that ships with databricks-cli:

pip install databricks-cli

and then:

from databricks_cli.workspace.api import WorkspaceApi
from databricks_cli.sdk.api_client import ApiClient
 
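client = ApiClient(
    host='https://your.databricks-url.net',
    token=api_key  # your personal access token
)
workspace_api = WorkspaceApi(client)
# Recent databricks-cli releases name these parameters language, fmt and is_overwrite
workspace_api.import_workspace(
    source_path="/your/dir/here/hello.py",
    target_path="/Repos/test/hello.py",
    language="PYTHON",
    fmt="SOURCE",
    is_overwrite=True
)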

10 REPLIES

cconnell
Contributor II

Where is this python code running, on your local machine or in Databricks?

RantoB
Valued Contributor

on my local machine.

RantoB
Valued Contributor

Hi, thanks for your answer.

Actually, both your code and mine work. However, I cannot write to the Repos directory, which is reserved (although I can create subdirectories in it...).

Thanks to your code, I got an error message that helped me understand the problem. With my code I got no error message.
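
The original script printed only `status_code`, which is why no message was visible. The sketch below shows one way to surface the error body as well. It assumes the API returns a JSON body with `error_code` and `message` fields; the helper name `describe_api_error` and the sample body are hypothetical, made up for illustration.

```python
import json


def describe_api_error(status_code: int, body_text: str) -> str:
    """Return a one-line summary of a failed REST call."""
    try:
        body = json.loads(body_text)
    except ValueError:
        body = None  # the body was not JSON, fall back to raw text
    if isinstance(body, dict):
        code = body.get("error_code", "UNKNOWN")
        message = body.get("message", "")
        return "HTTP {}: {}: {}".format(status_code, code, message)
    return "HTTP {}: {}".format(status_code, body_text.strip())


# Made-up sample body, only to illustrate the shape of the output.
sample = '{"error_code": "INVALID_PARAMETER_VALUE", "message": "example message"}'
print(describe_api_error(400, sample))
# prints: HTTP 400: INVALID_PARAMETER_VALUE: example message
```

With `requests`, this would be called as `describe_api_error(import_notebook.status_code, import_notebook.text)` whenever `import_notebook.ok` is false.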

Hubert-Dudek
Esteemed Contributor III

any chance to be selected as the best answer 🙂 ? 😉

RantoB
Valued Contributor

Hi @Hubert Dudek,

Do you know where I can find the documentation for the Python API for Databricks?

Also, do you know how to launch a job or a notebook remotely with the Python API?

Thanks

Hubert-Dudek
Esteemed Contributor III

https://docs.databricks.com/dev-tools/cli/index.html

However, the CLI docs cover the command line rather than the SDK, so I checked what is available in the source on GitHub: https://github.com/databricks/databricks-cli/tree/master/databricks_cli/sdk

There is a class JobsService with a method run_now. I think that is what you are looking for.

RantoB
Valued Contributor

I finally found it. What I needed was this:

from databricks_cli.jobs.api import JobsApi
from databricks_cli.sdk.api_client import ApiClient

But I had to guess, based on what you told me about databricks_cli.workspace.api.
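
As an alternative to the databricks-cli classes, a job can probably also be triggered through the REST API directly, in the same style as the original question. The sketch below only builds the request. It assumes the Jobs `run-now` endpoint under `/api/2.1`; the helper name `build_run_now_request`, the job id `123` and the `run_date` parameter are hypothetical placeholders.

```python
def build_run_now_request(host, token, job_id, notebook_params=None):
    """Build keyword arguments for requests.post() to trigger a job run."""
    payload = {"job_id": job_id}
    if notebook_params:
        # Optional key/value parameters passed to a notebook task.
        payload["notebook_params"] = notebook_params
    return {
        "url": "{}/api/2.1/jobs/run-now".format(host.rstrip("/")),
        "headers": {"Authorization": "Bearer {}".format(token)},
        "json": payload,
    }


# Placeholder values, replace with your own workspace URL, token and job id.
request_kwargs = build_run_now_request(
    "https://your.databricks-url.net/", "<token>", 123, {"run_date": "2022-01-01"}
)
# To actually send it (requires the requests package and network access):
#   response = requests.post(**request_kwargs)
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and payload before anything is sent.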

Ramya
New Contributor III

Hi All,

I tried the same code but it's not working for me. I keep getting

 import_workspace() got an unexpected keyword argument 'format'.

from azure.identity import DefaultAzureCredential
from databricks_cli.workspace.api import WorkspaceApi
from databricks_cli.sdk.api_client import ApiClient

token_credential = DefaultAzureCredential()
scope = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"
token = token_credential.get_token(scope)
access_token = str(token.token)

client = ApiClient(
  host='https://your.databricks-url.net',
  token=access_token
)
workspace_api = WorkspaceApi(client)
workspace_api.import_workspace(
  source_path="/home/hello.py",
  target_path="/repo/test/hello.py",
  language="PYTHON",
  format="SOURCE"  # this keyword raises the error quoted above
)

Ramya
New Contributor III

It looks like the parameter names of import_workspace are different in recent versions. It worked after I switched to the correct parameters:

workspace_api.import_workspace(
  source_path="/home/hello.py",
  target_path="/repo/test/hello.py",
  language="PYTHON",
  fmt="SOURCE",
  is_overwrite=True
)
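
When a call fails with "got an unexpected keyword argument", one way to find the accepted names without guessing is to inspect the function signature. The sketch below uses a hypothetical stand-in function that mirrors the parameter names reported in this thread; the helper `unknown_keywords` is also hypothetical. With databricks-cli installed, the real method `WorkspaceApi.import_workspace` could be passed instead.

```python
import inspect


def import_workspace(source_path, target_path, language, fmt, is_overwrite):
    """Hypothetical stand-in with the parameter names reported above."""
    return None


def unknown_keywords(func, **kwargs):
    """Return the keyword names that func would reject, sorted."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means any keyword name is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(name for name in kwargs if name not in params)


print(unknown_keywords(import_workspace, language="PYTHON", format="SOURCE"))
# prints: ['format']
```

Printing `inspect.signature(func)` directly is an even shorter check when you only want to read the parameter list.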
