02-10-2022 01:45 PM
02-10-2022 04:39 PM
Using the Workspace API, you can list all the notebooks for a given user.
The API response tells you whether each object under the path is a folder or a notebook.
If it's a folder, add it to the path and list the notebooks inside that folder.
Put all of that in an Excel sheet or similar and ask your team members whether they still need each notebook.
GET https://<databricks-host-name>/api/2.0/workspace/list
Body:
{ "path": "/Users/<username>" }
Refer to this documentation for more details.
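The folder-by-folder listing described above can be sketched roughly like this in Python. It is only a sketch: it assumes you authenticate with a personal access token sent as a Bearer header, and the helper names (`make_lister`, `iter_notebooks`) are made up for illustration. The `objects`, `object_type`, and `path` fields are what the list endpoint returns.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator, List


def make_lister(host: str, token: str) -> Callable[[str], List[Dict]]:
    """Return a function that lists one workspace path via /api/2.0/workspace/list.

    Assumption: `token` is a personal access token accepted as a Bearer token.
    """
    def list_path(path: str) -> List[Dict]:
        query = urllib.parse.urlencode({"path": path})
        request = urllib.request.Request(
            f"https://{host}/api/2.0/workspace/list?{query}",
            headers={"Authorization": f"Bearer {token}"},
        )
        with urllib.request.urlopen(request) as response:
            # An empty folder comes back without an "objects" key.
            return json.load(response).get("objects", [])
    return list_path


def iter_notebooks(list_path: Callable[[str], List[Dict]], root: str) -> Iterator[str]:
    """Walk folders starting at `root` and yield the path of every notebook."""
    pending = [root]
    while pending:
        current = pending.pop()
        for obj in list_path(current):
            if obj.get("object_type") == "DIRECTORY":
                pending.append(obj["path"])  # descend into the folder later
            elif obj.get("object_type") == "NOTEBOOK":
                yield obj["path"]


# Example usage (placeholders must be filled in before running):
# lister = make_lister("<databricks-host-name>", "<token>")
# for notebook_path in iter_notebooks(lister, "/Users/<username>"):
#     print(notebook_path)
```

Passing the lister in as a function keeps the traversal separate from the HTTP call, so you can try the walk against canned data before pointing it at a real workspace.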
Also, archive them all in a GitHub repo for a period of 'x' months, in case someone needs access to the notebooks later.
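For the archiving step, one possible approach is to fetch each notebook from `/api/2.0/workspace/export` (which returns the source base64-encoded in a `content` field) and write it into a local clone of the archive repo before committing. The sketch below only covers the decode-and-write part. The function name, the extension mapping, and the directory layout are my own assumptions, not anything Databricks prescribes.

```python
import base64
from pathlib import Path
from typing import Dict

# Assumed mapping from the notebook "language" field to a file extension.
EXTENSIONS = {"PYTHON": ".py", "SQL": ".sql", "SCALA": ".scala", "R": ".r"}


def write_exported_notebook(
    export_response: Dict[str, str],
    notebook_path: str,
    language: str,
    archive_root: str,
) -> Path:
    """Decode one workspace/export response and save it under archive_root.

    The workspace path is mirrored on disk, e.g. /Users/a/n1 -> <root>/Users/a/n1.py.
    """
    source = base64.b64decode(export_response["content"])
    target = Path(archive_root) / notebook_path.lstrip("/")
    target = target.with_name(target.name + EXTENSIONS.get(language, ".txt"))
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(source)
    return target
```

After writing the files you would run the usual `git add`, `git commit`, and `git push` in the archive directory.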
Going forward, add sufficient logging to the notebooks, or some mechanism to record when each one executes.
It could be as simple as an insert statement in the top cell that adds a row to a table such as default.notebook_run, with the notebook name and a timestamp, every time the notebook runs. (Use underscores in the table name; a hyphenated name would have to be quoted with backticks.)
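A top-cell logger along those lines might look like the following. Treat it as a sketch under assumptions: the table `default.notebook_run` is hypothetical and must already exist with a string column and a timestamp column, and the helper names are invented. Names containing quotes or backslashes are rejected rather than escaped, to keep the example simple.

```python
from typing import Callable

RUN_LOG_TABLE = "default.notebook_run"  # hypothetical table: (STRING, TIMESTAMP)


def build_run_log_insert(notebook_name: str) -> str:
    """Build the INSERT statement that records one run of `notebook_name`."""
    if "'" in notebook_name or "\\" in notebook_name:
        raise ValueError("notebook name must not contain quotes or backslashes")
    return (
        f"INSERT INTO {RUN_LOG_TABLE} "
        f"VALUES ('{notebook_name}', current_timestamp())"
    )


def log_notebook_run(run_sql: Callable[[str], object], notebook_name: str) -> None:
    """Execute the insert using the given SQL runner (e.g. spark.sql)."""
    run_sql(build_run_log_insert(notebook_name))


# In the first cell of a Databricks notebook, where `spark` is predefined:
# log_notebook_run(spark.sql, "/Users/<username>/<notebook-name>")
```

Once a few months of rows have accumulated, a `SELECT notebook_name, max(timestamp)` style query over that table gives you the last-run date per notebook.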
02-10-2022 01:49 PM
You can see when a notebook was last run if it's attached to an active cluster. You can also read old logs to see what happened, but that's a lot of work for almost no gain. There isn't any harm in keeping old notebooks that aren't run. I have some notebooks in a workspace that I have never run once, and it's not a problem.
02-10-2022 01:54 PM
Hi Josephk, I'm new to Databricks, but I've been asked to clean up old notebooks in our environment that have been created over the years and are no longer used. Is there an API I can use to find the last time a notebook was run? Or do you have any other suggestions?
02-10-2022 03:56 PM
I looked around internally and couldn't find anything. Certainly nothing in the docs. Maybe just try deleting things and seeing if people complain?
02-16-2022 10:48 AM
Just wondering... I can display 'Recent Activity' for a notebook, which gives me the information I'm looking for. So it is being collected... someplace. I can't find it in the APIs. Is there anywhere else I could look for that info?