Rewrite Notebooks Programmatically

Avinash_Narala
New Contributor III

Hello,

I want to refactor a notebook programmatically, so I wrote the following code:

 

import requests
import base64

# Databricks Workspace API URLs
workspace_url = f"{host}/api/2.0/workspace"
export_url = f"{workspace_url}/export"
import_url = f"{workspace_url}/import"

# Databricks personal access token and request headers
token = "***********************"
headers = {"Authorization": f"Bearer {token}"}

# Notebook path
notebook_path = "/Workspace/Users/Avinash/Old_Notebook"
new_notebook_path = "/Workspace/Users/Avinash/New_notebook"

# Function to export notebook content
def export_notebook():
    export_payload = {
        "path": notebook_path,
        "format": "SOURCE"
    }
    response = requests.get(export_url, headers=headers, params=export_payload)
    response.raise_for_status()
    return response.json()["content"]

# Function to refactor notebook content
def refactor_notebook(content):
    # Example: replace all occurrences of 'hive_metastore' with 'test_catalog_ph'
    new_content = content.replace("hive_metastore", "test_catalog_ph")
    return new_content

# Function to import modified notebook content
def import_notebook(new_content):
    import_payload = {
        "path": new_notebook_path,
        "content": new_content,
        # "overwrite": True
    }
    response = requests.post(import_url, headers=headers, json=import_payload)
    if response.status_code != 200:
        print(response.content)
    response.raise_for_status()
 
# Main script
if __name__ == "__main__":
    # Export notebook content
    notebook_content = export_notebook()
    decoded_content = base64.b64decode(notebook_content).decode('utf-8')

    # print(decoded_content)
    # Refactor notebook content
    new_notebook_content = refactor_notebook(decoded_content)

    # Encode the modified content as a base64 string (JSON cannot serialize raw bytes)
    encoded_content = base64.b64encode(new_notebook_content.encode('utf-8')).decode('utf-8')

    #print(encoded_content)
    #print(base64.b64decode(encoded_content).decode('utf-8'))

    # Import modified content back to Databricks
    import_notebook(encoded_content)

    print("Notebook refactoring complete.")

 

 

I am able to rewrite the content by exporting the notebook as JSON, but I am unable to import the notebook from the JSON back into my workspace; I get the following error:

HTTPError: 400 Client Error: Bad Request for url.

Can you please help me with this?

8 REPLIES

Kaniz
Community Manager

Hi @Avinash_Narala, the HTTPError 400 indicates a Bad Request, meaning there is likely an issue with the request you're making to the Databricks API.

Here are some steps to help you resolve this:

  1. Check the Request Payload:

    • Ensure that the payload you’re sending in the import_notebook function is correctly formatted.
    • Verify that the content, path, language, and other parameters are set appropriately.
    • Double-check the JSON structure and make sure it matches the expected format for importing a notebook.
  2. Content Encoding:

    • You’ve encoded the notebook content using base64. Make sure that the encoding and decoding process is consistent.
    • Confirm that the data variable contains the base64-encoded content of the modified notebook.
  3. Path and Overwrite:

    • Verify that the new_notebook_path is a valid path in your Databricks workspace.
    • If you’re overwriting an existing notebook, uncomment the "overwrite": True line in your import_notebook function.
  4. Permissions and Workspace Configuration:

    • Ensure that your Databricks personal access token (TOKEN) has the necessary permissions to create or overwrite notebooks in the specified path.
    • Check if there are any workspace-specific configurations or restrictions that might affect the import process.
  5. Use the Databricks CLI (Recommended):

    • Consider using the Databricks CLI for importing notebooks. It simplifies the process and provides better error handling.
    • Install the Databricks CLI (pip install databricks-cli) and use the import_workspace method.

If you continue to face issues, feel free to provide additional details or error messages, and we’ll work through them together! 🚀
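To make step 1 above concrete, here is a minimal sketch of a well-formed import payload. It assumes the notebook was exported with "format": "SOURCE" (per the Workspace Import API, "language" is required when the format is SOURCE, and "content" must be a base64-encoded string); the path and source text are illustrative placeholders:

```python
import base64

def build_import_payload(path, source_text, language="PYTHON", overwrite=True):
    # The Import API expects "content" as a base64-encoded string,
    # and "language" is required when "format" is "SOURCE".
    encoded = base64.b64encode(source_text.encode("utf-8")).decode("utf-8")
    return {
        "path": path,
        "content": encoded,      # base64 string, not raw bytes
        "format": "SOURCE",
        "language": language,    # required when format is SOURCE
        "overwrite": overwrite,
    }

payload = build_import_payload(
    "/Workspace/Users/Avinash/New_notebook",  # placeholder path
    "print('hello')",
)
# The payload can then be sent with:
# requests.post(f"{host}/api/2.0/workspace/import", headers=headers, json=payload)
```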

Avinash_Narala
New Contributor III

Hello Kaniz,

I've implemented steps 1-4 as you instructed, but the issue I am facing is with importing my notebook.

In detail, I can export my notebook as JSON and work with the JSON (replacing some specific words), but I am unable to upload that JSON back as a notebook.

I tried two ways:

1. Overwriting the existing notebook.

2. Creating a new notebook.

So, if you could show me how to create a notebook in my Databricks workspace from the JSON content I have, it would be really helpful.

Thank you.

Hi @Kaniz,

any update on this?

Kaniz
Community Manager

Hi @Avinash_Narala, let's create a new notebook in your Databricks workspace using the modified JSON content you have. Below are the steps to achieve this programmatically:

  1. Create a New Notebook:

    • To create a new notebook, you’ll need to use the Databricks REST API.
    • Make sure you have the necessary permissions to create notebooks in your workspace.
  2. Prepare the Import Payload:

    • The import_payload dictionary should include the following parameters:
      • "path": The path where you want to create the new notebook (e.g., "/Workspace/Users/Avinash/New_notebook").
      • "content": The modified notebook content in base64-encoded JSON format.
      • Optionally, you can include "overwrite": True if you want to overwrite an existing notebook with the same path.
  3. Send a POST Request:

    • Use the requests.post() method to send a POST request to the Databricks import URL (import_url).
    • Include the headers with the authorization token (token).
  4. Handle the Response:

    • Check the response status code. If it’s 200, the notebook was successfully created.
    • If there’s an error (e.g., 400 Bad Request), print the response content to debug the issue.

Here’s an updated version of your script with the necessary modifications for creating a new notebook:

import requests
import base64

# Databricks Workspace API URLs
workspace_url = f"{host}/api/2.0/workspace"
import_url = f"{workspace_url}/import"

# Databricks personal access token and request headers
token = "***********************"
headers = {"Authorization": f"Bearer {token}"}

# New notebook path
new_notebook_path = "/Workspace/Users/Avinash/New_notebook"

# Function to import modified notebook content
def import_notebook(new_content):
    import_payload = {
        "path": new_notebook_path,
        "content": new_content,
        # Uncomment the line below if you want to overwrite an existing notebook
        # "overwrite": True
    }
    response = requests.post(import_url, headers=headers, json=import_payload)
    if response.status_code != 200:
        print(response.content)
    response.raise_for_status()

# Main script
if __name__ == "__main__":
    # ... (Your existing code for exporting and modifying the notebook content)

    # Refactor notebook content
    new_notebook_content = refactor_notebook(decoded_content)

    # Encode the modified content
    encoded_content = base64.b64encode(new_notebook_content.encode('utf-8')).decode('utf-8')

    # Import modified content back to Databricks
    import_notebook(encoded_content)

    print("Notebook refactoring completed successfully!")

Make sure to replace the placeholders (host, new_notebook_path, etc.) with your actual values. If you encounter any issues, check the response content for more details. Good luck, and feel free to ask if you need further assistance! 😊

Avinash_Narala
New Contributor III

Same error message:

Attaching the error message

Hi @Avinash_Narala, print the response content to see if there are any additional details. Use print(response.content) after the import request to get more information about the issue.

Avinash_Narala
New Contributor III

Hi @Kaniz,

Attaching the response.content

 

Hi @Avinash_Narala, thank you for sharing the response content.

Let’s address the issue you’re facing while importing the notebook from JSON.

The error message you received indicates a Bad Request. To troubleshoot this, let’s focus on the import process. Here are some steps to help you create a new notebook using the provided JSON content:

  1. Decode the Base64 Content:

    • First, ensure that you’ve correctly decoded the base64-encoded content from the response. The decoded content should represent a valid Databricks notebook in JSON format.
  2. Create a New Notebook:

    • Use the Databricks Workspace API to create a new notebook.
    • Set the new_notebook_path variable to the desired path where you want to create the new notebook.
    • Make sure the path is unique and does not conflict with existing notebooks.
  3. HTTP Request for Creating a New Notebook:

    • Send an HTTP POST request to the Databricks API endpoint for creating a new notebook (import_url).
    • Include the following parameters in the request payload:
      • "path": The notebook path where you want to create the new notebook.
      • "content": The modified notebook content, base64-encoded.
  4. Optional: Overwrite Existing Notebook:

    • If you intend to overwrite an existing notebook with the same path, uncomment the "overwrite": True line in the request payload.
  5. Inspect the Response:

    • Capture the response content when making the API request. Print it out to get more details about the issue.
    • Modify the import_notebook function to print the response content (similar to what we did earlier).
  6. Check Authentication and Permissions:

    • Ensure that your personal access token (token) is valid and has the necessary permissions to create notebooks.
    • Verify that the token is correctly included in the request headers.

Here’s a modified version of the script to create a new notebook using the provided content:

import requests
import base64

# Databricks Workspace API URLs
workspace_url = f"{host}/api/2.0/workspace"
import_url = f"{workspace_url}/import"

# Databricks personal access token and request headers
token = "***********************"
headers = {"Authorization": f"Bearer {token}"}

# New notebook path
new_notebook_path = "/Workspace/Users/Avinash/New_notebook"

# Function to import notebook content
def import_notebook(new_content):
    import_payload = {
        "path": new_notebook_path,
        "content": new_content,
        # "overwrite": True  # Uncomment if you want to overwrite an existing notebook
    }
    response = requests.post(import_url, headers=headers, json=import_payload)
    if response.status_code != 200:
        print(response.content)  # Print the response content
    response.raise_for_status()

# Main script
if __name__ == "__main__":
    # Modify this part to use your modified notebook content
    modified_source = "..."  # Your modified notebook source here

    # Re-encode the content as a base64 string before importing
    encoded_content = base64.b64encode(modified_source.encode('utf-8')).decode('utf-8')

    # Import modified content to create a new notebook
    import_notebook(encoded_content)

    print("New notebook created successfully.")

Replace the placeholder "..." with your actual notebook content. Execute the script, and it should create a new notebook in your Databricks workspace. If you encounter any issues or need further assistance, feel free to ask! 🚀
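The export/refactor/import round-trip discussed throughout this thread hinges on consistent base64 handling: decode what the export endpoint returns, edit the plain text, and re-encode it as a base64 string for the import payload. A self-contained sketch of just those encode/decode steps (no network calls; the sample source text is illustrative):

```python
import base64

def refactor_source(b64_content, old, new):
    # Decode the exported base64 payload, apply the text replacement,
    # and re-encode it as a base64 *string* ready for the import payload.
    source = base64.b64decode(b64_content).decode("utf-8")
    refactored = source.replace(old, new)
    return base64.b64encode(refactored.encode("utf-8")).decode("utf-8")

exported = base64.b64encode(b"SELECT * FROM hive_metastore.db.t").decode("utf-8")
updated = refactor_source(exported, "hive_metastore", "test_catalog_ph")
print(base64.b64decode(updated).decode("utf-8"))  # SELECT * FROM test_catalog_ph.db.t
```

Because the helper is pure (no HTTP), the replacement logic can be verified locally before wiring it into the API calls above.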