Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Rewrite Notebooks Programmatically

Avinash_Narala
New Contributor III

Hello,

I want to refactor a notebook programmatically, so I wrote the following code:

 

import requests
import base64

# Databricks Workspace API URLs
workspace_url = f"{host}/api/2.0/workspace"
export_url = f"{workspace_url}/export"
import_url = f"{workspace_url}/import"

# Databricks personal access token
token = "***********************"

# Notebook path
notebook_path = "/Workspace/Users/Avinash/Old_Notebook"
new_notebook_path = "/Workspace/Users/Avinash/New_notebook"

# Function to export notebook content
def export_notebook():
    export_payload = {
        "path": notebook_path,
        "format": "SOURCE"
    }
    response = requests.get(export_url, headers=headers, json=export_payload)
    response.raise_for_status()
    return response.json()["content"]

# Function to refactor notebook content
def refactor_notebook(content):
    # Example: replace all occurrences of 'hive_metastore' with 'test_catalog_ph'
    new_content = content.replace("hive_metastore", "test_catalog_ph")
    return new_content

# Function to import modified notebook content
def import_notebook(new_content):
    import_payload = {
        "path": new_notebook_path,
        "content": new_content,
        # "overwrite": True
    }
    response = requests.post(import_url, headers=headers, json=import_payload)
    if response.status_code != 200:
        print(response.content)
    response.raise_for_status()
 
# Main script
if __name__ == "__main__":
    # Export notebook content
    notebook_content = export_notebook()
    decoded_content = base64.b64decode(notebook_content).decode('utf-8')

    # print(decoded_content)
    # Refactor notebook content
    new_notebook_content = refactor_notebook(decoded_content)

    encoded_content=(base64.b64encode(new_notebook_content.encode('utf-8')))

    #print(encoded_content)
    #print(base64.b64decode(encoded_content).decode('utf-8'))

    # Import modified content back to Databricks
    import_notebook(encoded_content)

    print("Notebook refactoring complete.")

 

 

I am able to export the notebook as JSON and rewrite its content, but I am unable to import the modified content back into my workspace. I get this error:

HTTPError: 400 Client Error: Bad Request for url.

Can you please help me with this?

8 REPLIES

Kaniz
Community Manager

Hi @Avinash_Narala, the HTTPError 400 indicates a Bad Request, which means there might be an issue with the request you’re making to the Databricks API.

Here are some steps to help you resolve this:

  1. Check the Request Payload:

    • Ensure that the payload you’re sending in the import_notebook function is correctly formatted.
    • Verify that the content, path, language, and other parameters are set appropriately.
    • Double-check the JSON structure and make sure it matches the expected format for importing a notebook.
  2. Content Encoding:

    • You’ve encoded the notebook content using base64. Make sure that the encoding and decoding process is consistent.
    • Confirm that the encoded_content variable contains the base64-encoded content of the modified notebook.
  3. Path and Overwrite:

    • Verify that the new_notebook_path is a valid path in your Databricks workspace.
    • If you’re overwriting an existing notebook, uncomment the "overwrite": True line in your import_notebook function.
  4. Permissions and Workspace Configuration:

    • Ensure that your Databricks personal access token (token) has the necessary permissions to create or overwrite notebooks in the specified path.
    • Check if there are any workspace-specific configurations or restrictions that might affect the import process.
  5. Use Databricks CLI API (Recommended):

    • Consider using the Databricks CLI API for importing notebooks. It simplifies the process and provides better error handling.
    • Install the Databricks CLI (pip install databricks-cli) and use the import_workspace method.
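
As a rough illustration of points 1 to 3, the sketch below builds an import payload without sending it. It assumes the notebook is a Python notebook exported in SOURCE format; build_import_payload is a hypothetical helper, not part of any Databricks library. The Workspace import endpoint expects content as a base64 string, and for SOURCE format it also expects a language, so leaving either out is a plausible cause of a 400.

```python
import base64
import json


def build_import_payload(path, source_text, language="PYTHON", overwrite=False):
    """Build the JSON body for POST /api/2.0/workspace/import (hypothetical helper)."""
    # b64encode returns bytes; decode to str so json.dumps can serialize it.
    encoded = base64.b64encode(source_text.encode("utf-8")).decode("ascii")
    return {
        "path": path,
        "format": "SOURCE",
        "language": language,  # assumption: the notebook is a Python notebook
        "content": encoded,
        "overwrite": overwrite,
    }


payload = build_import_payload(
    "/Workspace/Users/Avinash/New_notebook",
    "# Databricks notebook source\nprint('hello')\n",
)
body = json.dumps(payload)  # would raise TypeError if content were still bytes
```

The resulting payload can be passed to requests.post(import_url, headers=headers, json=payload) in place of the hand-built dictionary.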

If you continue to face issues, feel free to provide additional details or error messages, and we’ll work through them together! 🚀

Avinash_Narala
New Contributor III

Hello Kaniz,

I've followed steps 1-4 as you suggested, but I still hit the error when importing my notebook.

In detail, I can export my notebook as JSON and work with it (replacing specific words), but I am not able to upload that JSON back as a notebook.

I tried two ways:

1. Overwriting the existing notebook.

2. Creating a new notebook.

Could you show me how to create a notebook in my Databricks workspace from the JSON content I have? That would be really helpful.

Thank you.

Hi @Kaniz,

any update on this?

Kaniz
Community Manager

Hi @Avinash_Narala, let’s create a new notebook in your Databricks workspace using the modified JSON content you have. Below are the steps to achieve this programmatically:

  1. Create a New Notebook:

    • To create a new notebook, you’ll need to use the Databricks REST API.
    • Make sure you have the necessary permissions to create notebooks in your workspace.
  2. Prepare the Import Payload:

    • The import_payload dictionary should include the following parameters:
      • "path": The path where you want to create the new notebook (e.g., "/Workspace/Users/Avinash/New_notebook").
      • "content": The modified notebook content in base64-encoded JSON format.
      • Optionally, you can include "overwrite": True if you want to overwrite an existing notebook with the same path.
  3. Send a POST Request:

    • Use the requests.post() method to send a POST request to the Databricks import URL (import_url).
    • Include the headers with the authorization token (token).
  4. Handle the Response:

    • Check the response status code. If it’s 200, the notebook was successfully created.
    • If there’s an error (e.g., 400 Bad Request), print the response content to debug the issue.
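
Steps 2 and 3 can be sketched with the standard library so that the request can be inspected before it is sent. The host and token values here are placeholders, build_import_request is a hypothetical helper, and the language field assumes a Python notebook. Nothing is sent until the request is passed to urllib.request.urlopen.

```python
import base64
import json
import urllib.request


def build_import_request(host, token, path, source_text, overwrite=False):
    """Prepare (but do not send) a POST to the workspace import endpoint."""
    payload = {
        "path": path,
        "format": "SOURCE",
        "language": "PYTHON",  # assumption: Python notebook
        "content": base64.b64encode(source_text.encode("utf-8")).decode("ascii"),
        "overwrite": overwrite,
    }
    return urllib.request.Request(
        url=f"{host}/api/2.0/workspace/import",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token, for illustration only.
req = build_import_request(
    "https://example.cloud.databricks.com",
    "dummy-token",
    "/Workspace/Users/Avinash/New_notebook",
    "print('hello')\n",
)
# urllib.request.urlopen(req) would actually send it.
```

Building the request separately makes it easy to confirm that the Authorization header is present and that the body is valid JSON before involving the API.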

Here’s an updated version of your script with the necessary modifications for creating a new notebook:

import requests
import base64

# Databricks Workspace API URLs
workspace_url = f"{host}/api/2.0/workspace"
import_url = f"{workspace_url}/import"

# Databricks personal access token
token = "***********************"

# Authorization header sent with every request
headers = {"Authorization": f"Bearer {token}"}

# New notebook path
new_notebook_path = "/Workspace/Users/Avinash/New_notebook"

# Function to import modified notebook content
def import_notebook(new_content):
    import_payload = {
        "path": new_notebook_path,
        "content": new_content,
        # Uncomment the line below if you want to overwrite an existing notebook
        # "overwrite": True
    }
    response = requests.post(import_url, headers=headers, json=import_payload)
    if response.status_code != 200:
        print(response.content)
    response.raise_for_status()

# Main script
if __name__ == "__main__":
    # ... (Your existing code for exporting and modifying the notebook content)

    # Refactor notebook content
    new_notebook_content = refactor_notebook(decoded_content)

    # Encode the modified content; decode the bytes to str so the payload is JSON-serializable
    encoded_content = base64.b64encode(new_notebook_content.encode('utf-8')).decode('utf-8')

    # Import modified content back to Databricks
    import_notebook(encoded_content)

    print("Notebook refactoring completed successfully!")

Make sure to replace the placeholders (host, new_notebook_path, etc.) with your actual values. If you encounter any issues, check the response content for more details. Good luck, and feel free to ask if you need further assistance! 😊

Avinash_Narala
New Contributor III

Same error message:

Attaching the error message

Hi @Avinash_Narala, print the response content to see if there are any additional details. Use print(response.content) after the import request to get more information about the issue.
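
One way to make that output easier to read is sketched below. Databricks REST errors usually come back as a small JSON object with error_code and message fields; the helper name and the sample body are made up for illustration, and the fallback covers the case where the body is not JSON.

```python
import json


def describe_error(status_code, body):
    """Turn a raw error response body (bytes) into a readable one-line summary."""
    text = body.decode("utf-8", errors="replace")
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return f"HTTP {status_code}: {text}"
    if not isinstance(parsed, dict):
        return f"HTTP {status_code}: {text}"
    code = parsed.get("error_code", "UNKNOWN")
    message = parsed.get("message", text)
    return f"HTTP {status_code} [{code}]: {message}"


# Made-up sample body, not the actual response from this thread.
sample = b'{"error_code": "INVALID_PARAMETER_VALUE", "message": "example message"}'
print(describe_error(400, sample))
```

With requests, this would be called as describe_error(response.status_code, response.content) before response.raise_for_status().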

Avinash_Narala
New Contributor III

Hi @Kaniz,

Attaching the response.content

 

Hi @Avinash_Narala, thank you for sharing the response content.

Let’s address the issue you’re facing while importing the notebook from JSON.

The error message you received indicates a Bad Request. To troubleshoot this, let’s focus on the import process. Here are some steps to help you create a new notebook using the provided JSON content:

  1. Decode the Base64 Content:

    • First, ensure that you’ve correctly decoded the base64-encoded content from the response. The decoded content should represent a valid Databricks notebook in JSON format.
  2. Create a New Notebook:

    • Use the Databricks Workspace API to create a new notebook.
    • Set the new_notebook_path variable to the desired path where you want to create the new notebook.
    • Make sure the path is unique and does not conflict with existing notebooks.
  3. HTTP Request for Creating a New Notebook:

    • Send an HTTP POST request to the Databricks API endpoint for creating a new notebook (import_url).
    • Include the following parameters in the request payload:
      • "path": The notebook path where you want to create the new notebook.
      • "content": The notebook content, base64-encoded (the import API expects base64 text, not the raw decoded content).
  4. Optional: Overwrite Existing Notebook:

    • If you intend to overwrite an existing notebook with the same path, uncomment the "overwrite": True line in the request payload.
  5. Inspect the Response:

    • Capture the response content when making the API request. Print it out to get more details about the issue.
    • Modify the import_notebook function to print the response content (similar to what we did earlier).
  6. Check Authentication and Permissions:

    • Ensure that your personal access token (token) is valid and has the necessary permissions to create notebooks.
    • Verify that the token is correctly included in the request headers.
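
The decode, modify, re-encode round trip from step 1 can be checked locally before any import call. This is a minimal sketch: refactor_exported_content is a hypothetical helper, the sample notebook text is invented, and the catalog names are the ones from the original question.

```python
import base64


def refactor_exported_content(exported_b64, old, new):
    """Decode exported base64 content, replace text, and re-encode as a str."""
    source = base64.b64decode(exported_b64).decode("utf-8")
    updated = source.replace(old, new)
    return base64.b64encode(updated.encode("utf-8")).decode("ascii")


# Simulate the "content" field of an export response.
original = "SELECT * FROM hive_metastore.sales.orders\n"
exported = base64.b64encode(original.encode("utf-8")).decode("ascii")

reencoded = refactor_exported_content(exported, "hive_metastore", "test_catalog_ph")
print(base64.b64decode(reencoded).decode("utf-8"))
```

Returning a str rather than bytes matters because the value goes straight into a JSON payload, and bytes are not JSON-serializable.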

Here’s a modified version of the script to create a new notebook using the provided content:

import requests
import base64

# Databricks Workspace API URLs
workspace_url = f"{host}/api/2.0/workspace"
import_url = f"{workspace_url}/import"

# Databricks personal access token
token = "***********************"

# Authorization header sent with every request
headers = {"Authorization": f"Bearer {token}"}

# New notebook path
new_notebook_path = "/Workspace/Users/Avinash/New_notebook"

# Function to import notebook content
def import_notebook(new_content):
    import_payload = {
        "path": new_notebook_path,
        "content": new_content,
        # "overwrite": True  # Uncomment if you want to overwrite an existing notebook
    }
    response = requests.post(import_url, headers=headers, json=import_payload)
    if response.status_code != 200:
        print(response.content)  # Print the response content
    response.raise_for_status()

# Main script
if __name__ == "__main__":
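    # Modify this part to use your decoded notebook content
    decoded_notebook_content = "..."  # Your decoded content here

    # The import API expects base64-encoded content, so re-encode it as a str first
    encoded_content = base64.b64encode(decoded_notebook_content.encode('utf-8')).decode('utf-8')

    # Import the content to create a new notebook
    import_notebook(encoded_content)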

    print("New notebook created successfully.")

Replace the placeholder "..." with your actual decoded notebook content. Execute the script, and it should create a new notebook in your Databricks workspace. If you encounter any issues or need further assistance, feel free to ask! 🚀
