04-20-2023 01:45 AM
I have a workflow that runs every month and creates a new notebook containing the outputs from the main notebook. However, after some time, the outputs in the created notebook disappear. Is there any way I can retain the outputs?
04-24-2023 08:47 PM
@Shaun Ang :
There are a few possible reasons why the outputs from the created notebook might be disappearing:
To retain the outputs, you can try the following:
I hope this helps, and please let me know if you have any further questions or concerns.
04-20-2023 02:43 AM
To follow up on the discussion: when the new notebook with command outputs is created, its revision history is empty and it shows a pending revision. I have to click save manually for the outputs to stay. Is there a way to save the revision automatically from the workflow so that the outputs are retained?
04-24-2023 08:49 PM
@Shaun Ang :
You can use the Databricks Workspace API to handle the created notebook programmatically, without the need for manual intervention.
As far as I know, the WorkspaceApi class in the databricks-cli package has no method that saves a notebook revision directly. What it does offer is export_workspace, so one option is to export the notebook right after the workflow creates it and keep that copy, outputs included. Here's an example code snippet that shows how to do this. Treat it as a sketch: I have not verified that an export taken while the revision is still pending contains the outputs.
from databricks_cli.sdk.api_client import ApiClient
from databricks_cli.workspace.api import WorkspaceApi
# Set up the Databricks API client (it needs the workspace URL as well as a token)
api_client = ApiClient(
    host="https://<workspace-url>",
    token=dbutils.secrets.get(scope="<scope>", key="<key>"),
)
workspace_api = WorkspaceApi(api_client)
# Workspace path of the notebook created by the workflow
notebook_path = "/path/to/new/notebook"
# File on the driver where the exported copy is written (placeholder path)
export_path = "/dbfs/<export-dir>/new_notebook_name.html"
# Export the notebook as HTML, overwriting any earlier export at export_path
workspace_api.export_workspace(notebook_path, export_path, "HTML", True)
In this example, we first set up the Databricks API client using the workspace URL and an API token retrieved from Databricks secrets. We then call the
export_workspace method of the WorkspaceApi object, which takes the workspace path of the notebook, the local file to write to, the export format (in this case, "HTML", which renders the notebook together with its command results), and a flag that allows an existing file at that location to be overwritten.
Running this as the last task of the workflow gives you a copy of the outputs that does not depend on the notebook's revision history. You'll need to replace <workspace-url> and <export-dir> with values for your own environment.
Note that you'll need to replace <scope> and <key> in the dbutils.secrets.get method with the appropriate scope and key names for your Databricks environment.
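If you would rather not depend on the databricks-cli package, the same idea can be sketched against the Workspace REST API directly. The snippet below is a minimal sketch, not a verified solution to the pending-revision problem: it calls the export endpoint (GET /api/2.0/workspace/export) and writes the decoded notebook to a file so that a copy of the outputs is kept. The function names, the host and token arguments, and the target file are placeholders I made up for illustration.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_export_request(host, token, notebook_path, fmt="HTML"):
    """Build a GET request for the Workspace API export endpoint."""
    query = urllib.parse.urlencode({"path": notebook_path, "format": fmt})
    url = f"{host.rstrip('/')}/api/2.0/workspace/export?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def decode_export_response(body):
    """Return the exported notebook bytes from the JSON response body.

    The endpoint returns the notebook base64-encoded in the "content" field.
    """
    return base64.b64decode(json.loads(body)["content"])


def export_notebook(host, token, notebook_path, target_file, fmt="HTML"):
    """Download the notebook at notebook_path and write it to target_file."""
    request = build_export_request(host, token, notebook_path, fmt)
    with urllib.request.urlopen(request) as response:
        data = decode_export_response(response.read())
    with open(target_file, "wb") as f:
        f.write(data)
```

You could call export_notebook(...) as the last step of the workflow, passing the workspace URL, a token from dbutils.secrets.get, the path of the created notebook, and a file location that outlives the cluster.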
I hope this helps, and please let me know if you have any further questions or concerns.