Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Currently, I am running a cluster that is set to terminate after 60 minutes of inactivity. However, in one of my notebooks, one of the cells is still running. How can I prevent this from happening, if want my notebook to run overnight without monito...
If a cell is already running ( I assume it's a streaming operation), then I think it doesn't mean that the cluster is inactive. The cluster should be running if a cell is running on it.On the other hand, if you want to keep running your clusters for ...
I've cloned a Repo during "Get Started with Data Engineering on Databricks".Then I'm trying to run another notebook from a cell with a magic %run command.But I get that the file can't be opened safe.Here my code:notebook_aname = "John"
print(f"Hello ...
Due to dependencies, if one of our cells errors then we want the notebook to stop executing.We've noticed some odd behaviour when executing notebooks depending on if "Run all cells in this notebook" is selected from the header versus "Run All Below"....
Has this been implemented? I have created a job using notebook. My notebook has 6 cells and if the code in first cell fails it should not run the rest of the cells
I'm building notebooks for tutorial sessions and I want to clear all the output results from the notebook before distributing it to the participants.
This functionality exists in Juypter but I can't find it in Databricks. Any pointers?
Yes! Run > Clear > Clear all cell outputs
Fun fact, this feature was made ~10 years ago when we realised all our customer demos looked very messy and had lots of spoilers in them!
In JupyterLab notebooks, we can --In edit mode, you can press Ctrl+Shift+Minus to split the current cell into two at the cursor position In command mode, you can click A or B to add a cell Above or Below the current cellare there equivalent shortcuts...
What's the status of the ctrl-alt-minus shortcut for splitting a cell? That keyboard combination does absolutely nothing in my interface (running Databricks via Chrome on GCP).
When running a notebook using dbutils.notebook.run from a master-notebook, an url to that running notebook is printed, i.e.:
Notebook job #223150 Notebook job #223151
Are there any ways to capture that Job Run ID (#223150 or #223151)? We have 50 or ...
I know this is an old thread, but sharing what is working for me well in Python now, for retrieving the run_id as well and building the entire link to that job run:job_id = dbutils.notebook.entry_point.getDbutils().notebook().getContext().jobId().get...
Hi ML folks,
We are using Databricks to train deep learning models. The code, however, has a complex structure of classes. This would work fine in a perfect bug-free world like Alice in Wonderland.
Debugging in Databricks is awkward. We ended up do...
Has this been solved yet; a mature way to debug code on databricks. I'm running in the same kind of issue.Variable explorer can be used and pdb, but not the same really..
Has anyone found a nice way to run code-formating (like black) on the notebooks **in the workspace**? My current workflow is to commit the file, pull it locally, format, repush and pull. It would be nice if it was some relatively easy way to run blac...
Hi Erik,I don't know if you are aware of this feature, currently there is an option to format the code in your databricks notebooks using the black code style formatter.Just you need to either have a version of your DBR equal to or greater than 11.2 ...
Hi guys,I have some notebooks with REPOS but I noticed that REPOS changed my notebook format to .py because of this my Azure Data Factory no longer recognizes the notebook (.py)Have any ideia to convert that .py to databricks format ?
that is odd. repos is merely another location (linked to git).You can copy/paste the code inside the py file into a notebook, or convert them using online tools or python libraries (like py2ipynb).
Hi,We are wondering if it is possible to disable the possibility to disable scheduling of a notebook. A client wants to allow many analysts access to databricks, but a concern is the possibility of setting schedules (the fastest is every minute!). Is...
Hi @Paulo Rijnberg Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.Please help us select the best solution by clicking on "Select As Best" if it does.Your feedba...
So, in this case our jobs are deployed from our development workspace to our isolated testing workspace via an automated Azure DevOps pipeline. As such, they are created (and thus run as) a service account user.Recently we made the switch to using gi...
Hi @Chris Block Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thanks...
To pass arguments/variables to a notebook, you can use a JSON file to temporarily store the arguments and then pass it as one argument to the notebook. After passing the JSON file to the notebook, you can parse it with json.loads(). The argument list...
I want to move notebooks , workflows , data from one users to another user in Azure Databricks. We move have access to that databricks. Is it possible? If, yes. How to move it.