Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
Hello, do any of you have experience with using OCRmyPDF in Databricks? I have tried to install it in various ways with different versions, but my notebook keeps crashing with the error: The Python process exited with exit code 139 (SIGSEGV: Segmentation...
Hi all, I’m currently organizing a growing number of notebooks in our Databricks workspace and trying to keep things manageable with proper tagging and metadata. One idea I had was to automatically apply tags to notebooks based on their folder structu...
Hi @EllaClark, yes, you can automate tagging of Databricks notebooks based on folder structure using the REST API and a script. Use the Workspace API to list notebook paths, extract folder names, and treat them as tags (see the sketch below). If the API supports metadata up...
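A minimal sketch of that approach, assuming the databricks-sdk package and an illustrative /Shared/projects root. Notebooks have no native tag field, so the script only builds a path-to-tags mapping for whatever metadata store you use:

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ObjectType

w = WorkspaceClient()  # picks up host/token from the notebook or ~/.databrickscfg

def collect_notebook_tags(root="/Shared/projects"):
    """Walk the workspace tree; parent folder names become the notebook's tags."""
    tagged = []

    def walk(path):
        for obj in w.workspace.list(path):
            if obj.object_type == ObjectType.DIRECTORY:
                walk(obj.path)
            elif obj.object_type == ObjectType.NOTEBOOK:
                # Folder segments between the root and the notebook act as tags.
                segments = obj.path[len(root):].strip("/").split("/")[:-1]
                tagged.append({"path": obj.path, "tags": segments})

    walk(root)
    return tagged

print(collect_notebook_tags())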
Dear all, I am upgrading the DBR version from 9.1 LTS to 15.4 LTS in Azure Databricks. For that I have created a new cluster with DBR 15.4 and attached an init script for installing application dependencies. The cluster has started successfully, but it takes 30 min. ...
Hey, this error usually happens when the cluster isn't fully ready before your application starts running. Since your init script takes about 30 minutes, it’s likely that your job starts before all dependencies are properly installed. The ModuleNotFo...
Hi, is there a simple way to sync a local notebook with a Databricks notebook? For example, is it possible to just connect to the Databricks kernel or something similar? I know there are IDE extensions for this, but unfortunately, they use the local d...
Hi @Kabi, to my knowledge Databricks doesn’t support connecting directly to a notebook kernel. However, here are practical ways to sync your local notebook with Databricks: You can use Git to version control your notebooks. Clone your repo into Dat...
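As one illustration of a sync round trip (not necessarily the exact method the reply goes on to describe), here is a hedged sketch using the databricks-sdk Workspace API; the local file name and workspace path are placeholders:

import base64
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ExportFormat, ImportFormat, Language

w = WorkspaceClient()  # reads host/token from env vars or ~/.databrickscfg

LOCAL = "analysis.py"
REMOTE = "/Shared/analysis"

# Push: upload the local file as a Python notebook, overwriting the remote copy.
with open(LOCAL, "rb") as f:
    w.workspace.import_(
        path=REMOTE,
        content=base64.b64encode(f.read()).decode(),
        format=ImportFormat.SOURCE,
        language=Language.PYTHON,
        overwrite=True,
    )

# Pull: export the workspace copy back down as source code.
exported = w.workspace.export(REMOTE, format=ExportFormat.SOURCE)
with open(LOCAL, "wb") as f:
    f.write(base64.b64decode(exported.content))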
Hi guys, I have a dashboard with a main page where I have a base query, and I added a date-time range widget and linked it to filter the base query. Now I have a Page 2 where I use a different summarized query as a source, base query2. I need this qu...
Hi @Mani2105, as far as I know, Databricks dashboards currently don’t support sharing widget parameters like date range filters across pages. Each page is isolated, so filters must be recreated manually per page. Manual configuration remains the only way to m...
We are trying to connect Databricks to OneLake, to read data from a Fabric workspace into Databricks, using a notebook. We also use Unity Catalog. We are able to read data from the workspace with a Service Principal like this: from pyspark.sql.types i...
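For reference, a hedged sketch of that kind of Service Principal read against OneLake over ABFS; the workspace, lakehouse, secret scope, and variable names are placeholders, and on Unity Catalog shared clusters these spark.conf settings may be restricted:

# OAuth via a service principal for the OneLake ABFS endpoint (illustrative values)
tenant_id = "<tenant-id>"
client_id = "<sp-client-id>"

spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", client_id)
spark.conf.set("fs.azure.account.oauth2.client.secret",
               dbutils.secrets.get("my-scope", "sp-secret"))
spark.conf.set("fs.azure.account.oauth2.client.endpoint",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# OneLake paths follow abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>/...
path = ("abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/"
        "MyLakehouse.Lakehouse/Tables/dbo/my_table")
df = spark.read.format("delta").load(path)
display(df)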
I can find documentation to enable automatic liquid clustering with SQL code: CLUSTER BY AUTO. But how do I do this with PySpark? I know I can do it with spark.sql("ALTER TABLE CLUSTER BY AUTO"), but ideally I want to pass it as an .option(). Thanks in...
To enable automatic liquid clustering with PySpark, you currently cannot pass it as an `.option()` or use a `.clusterBy("AUTO")` method in PySpark's `DataFrameWriter` API during table creation or modification. However, there are workarounds...
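A minimal sketch of one such workaround, assuming an existing DataFrame df and an illustrative table name: write the Delta table with the DataFrame API, then enable automatic clustering with a follow-up SQL statement.

# Write the table first; the writer has no clustering option to set here.
(df.write
   .format("delta")
   .mode("overwrite")
   .saveAsTable("main.sales.orders"))

# CLUSTER BY AUTO has no DataFrameWriter equivalent today, so apply it afterwards.
spark.sql("ALTER TABLE main.sales.orders CLUSTER BY AUTO")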
Are you kidding me here--I couldn't post this reply because (see arrows, because I can't say the words)? I've run afoul of this several times before; bad word detection was a solved problem in the 1990s, and there is even a term for errors like this--...
Hello @Rjdudley!
Thank you for bringing this to our attention. We understand how frustrating it can be to have your message incorrectly flagged, especially when you're contributing meaningfully. While our filters are in place to maintain a safe space...
How can I hide specific columns from a table visualization, but not in the exported data? I have over 200 columns in my query result and the UI freezes when I try to show it in a table visualization. So I want to hide specific columns, but if I export ...
Hey folks, I'm facing an issue with zone_id not getting overridden when redeploying the DAB template to the Databricks workspace. The Databricks job is already deployed and has the "ap-south-1a" zone_id. I wanted to make it "auto", so I have made the changes to th...
Hi everyone, I am training some reinforcement learning models and I am trying to automate the hyperparameter search using Optuna. I saw in the documentation that you can use joblib with Spark as a backend to train in parallel. I got that working with t...
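For context, the pattern that documentation describes looks roughly like the hedged sketch below: joblibspark provides the Spark backend, and train_and_evaluate is a hypothetical stand-in for the actual RL training loop.

import joblib
import optuna
from joblibspark import register_spark

register_spark()  # registers "spark" as a joblib backend on the cluster

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    gamma = trial.suggest_float("gamma", 0.9, 0.999)
    # train_and_evaluate is a hypothetical helper returning the score to maximize.
    return train_and_evaluate(lr=lr, gamma=gamma)

study = optuna.create_study(direction="maximize")
with joblib.parallel_backend("spark", n_jobs=8):
    study.optimize(objective, n_trials=64, n_jobs=8)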
Hi there, in the past I've posted questions in this community and I would consistently get responses back in a very reasonable time frame. Typically I think most of my posts have an initial response back within 1-2 days, or just a few days (I don't t...
Thank you for clarifying. I know some questions may be a bit more technical, but I hope I get some feedback/suggestions, particularly to my UMF Best Practice question!
Hi, I have a notebook with two input widgets set up ("current_month" and "current_year") that the notebook grabs values from and uses for processing. I want to be able to provide a list of input values in the "for each" task where each value is actual...
Hi there @sys08001, yes, it is possible. You can pass the input values for the for_each task in JSON format, somewhat like this:
[
  {
    "tableName": "product_2",
    "id": "1",
    "names": "John Doe",
    "created_at": "2025-02-22T10:00:00.000Z"
  },
...
I am going through a course learning Azure Databricks and I had created a new Azure Databricks Workspace. I am the owner of the subscription and created everything, so I assume I should have full admin rights. The following is my setup: Azure Student S...
@rodneyc8063 According to Azure’s documentation: Tip: Admins: If you’re unable to enable Databricks Assistant, you might need to disable the "Enforce data processing within workspace Geography for Designated Services" setting. See “For an ac...
I am looking for some help on getting Databricks cluster metrics such as memory utilization, CPU utilization, memory swap utilization, and free file system space using the REST API. I am trying it in Postman using a Databricks token and with my Service Principal bear...
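For what it's worth, utilization metrics are not returned by the Clusters REST API responses themselves; one hedged alternative is to query the system.compute.node_timeline system table through the SQL Statement Execution API with the same bearer token. The host, token, warehouse ID, and cluster ID below are placeholders, and the column names may differ by release:

import requests

HOST = "https://<workspace-host>"
TOKEN = "<pat-or-service-principal-token>"

resp = requests.post(
    f"{HOST}/api/2.0/sql/statements/",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "warehouse_id": "<sql-warehouse-id>",
        "wait_timeout": "30s",
        "statement": (
            "SELECT start_time, cpu_user_percent, mem_used_percent, mem_swap_percent "
            "FROM system.compute.node_timeline "
            "WHERE cluster_id = :cluster_id "
            "ORDER BY start_time DESC LIMIT 20"
        ),
        "parameters": [{"name": "cluster_id", "value": "<cluster-id>"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json().get("result", {}).get("data_array"))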