Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

kro
by New Contributor II
  • 612 Views
  • 2 replies
  • 2 kudos

OCRmyPDF in Databricks

Hello, do any of you have experience with using OCRmyPDF in Databricks? I have tried to install it in various ways with different versions, but my notebook keeps crashing with the error: The Python process exited with exit code 139 (SIGSEGV: Segmentation...

Get Started Discussions
ocr
ocrmypdf
pdf
segmentation fault
tesseract
  • 612 Views
  • 2 replies
  • 2 kudos
Latest Reply
sridharplv
Contributor
  • 2 kudos

Refer to this link as well: https://community.databricks.com/t5/data-engineering/pdf-parsing-in-notebook/td-p/14636

  • 2 kudos
1 More Replies
EllaClark
by New Contributor II
  • 417 Views
  • 2 replies
  • 0 kudos

Can I automate notebook tagging based on workspace folder structure?

Hi all, I’m currently organizing a growing number of notebooks in our Databricks workspace and trying to keep things manageable with proper tagging and metadata. One idea I had was to automatically apply tags to notebooks based on their folder structu...

  • 417 Views
  • 2 replies
  • 0 kudos
Latest Reply
Renu_
New Contributor III
  • 0 kudos

Hi @EllaClark, yes, you can automate tagging of Databricks notebooks based on folder structure using the REST API and a script. Use the Workspace API to list notebook paths, extract folder names, and treat them as tags. If the API supports metadata up...
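The folder-to-tag step described above can be sketched as a small pure function. This is a minimal sketch: the notebook path is a made-up example, and in practice the paths would come from the Workspace API list endpoint (`GET /api/2.0/workspace/list`).

```python
# Hypothetical helper: derive tags from a notebook's workspace path.
# Real paths would come from the Workspace API list endpoint; here we
# only show the tag-extraction step.

def path_to_tags(notebook_path: str) -> list[str]:
    """Treat each folder in the path (except the notebook name itself) as a tag."""
    parts = [p for p in notebook_path.split("/") if p]
    return parts[:-1]  # drop the final component, which is the notebook name

print(path_to_tags("/Teams/etl/daily_load"))  # → ['Teams', 'etl']
```

A tagging script would then loop over every listed notebook and attach these tags via whatever metadata mechanism your workspace uses.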

  • 0 kudos
1 More Replies
Kuchnhi
by New Contributor III
  • 783 Views
  • 10 replies
  • 6 kudos

Facing issues while upgrading DBR version from 9.1 LTS to 15.4 LTS

Dear all,I am upgrading DBR version from 9.1 LTS to 15.4 LTS in Azure Databricks. for that I have created a new cluster with 15.4 DBR attached init script for installing application dependencies. Cluster has started successfully but it takes 30 min. ...

  • 783 Views
  • 10 replies
  • 6 kudos
Latest Reply
SmithPoll
New Contributor II
  • 6 kudos

Hey, this error usually happens when the cluster isn't fully ready before your application starts running. Since your init script takes about 30 minutes, it’s likely that your job starts before all dependencies are properly installed. The ModuleNotFo...

  • 6 kudos
9 More Replies
Kabi
by New Contributor II
  • 211 Views
  • 1 replies
  • 1 kudos

Resolved! Simple notebook sync

Hi, is there a simple way to sync a local notebook with a Databricks notebook? For example, is it possible to just connect to the Databricks kernel or something similar? I know there are IDE extensions for this, but unfortunately, they use the local d...

  • 211 Views
  • 1 replies
  • 1 kudos
Latest Reply
Renu_
New Contributor III
  • 1 kudos

Hi @Kabi, to my knowledge Databricks doesn’t support directly connecting to its kernel. However, here are practical ways to sync your local notebook with Databricks: you can use Git to version control your notebooks. Clone your repo into Dat...

  • 1 kudos
Mani2105
by New Contributor II
  • 148 Views
  • 1 replies
  • 1 kudos

Databricks Dashboard ,passing Prompt Values from one page to another

Hi guys, I have a dashboard with a main page where I have a base query, and I added a date-time range widget and linked it to filter the base query. Now I have a Page 2 where I use a different summarized query as a source, base query 2. I need this qu...

  • 148 Views
  • 1 replies
  • 1 kudos
Latest Reply
Renu_
New Contributor III
  • 1 kudos

Hi @Mani2105, as far as I know, Databricks dashboards currently don’t support sharing widget parameters like date range filters across pages. Each page is isolated, so filters must be recreated manually per page. Manual configuration remains the only way to m...

  • 1 kudos
Judith
by New Contributor III
  • 1045 Views
  • 1 replies
  • 1 kudos

Connect to Onelake using Service Principal, Unity Catalog and Databricks Access Connector

We are trying to connect Databricks to OneLake, to read data from a Fabric workspace into Databricks, using a notebook. We also use Unity Catalog. We are able to read data from the workspace with a Service Principal like this: from pyspark.sql.types i...

  • 1045 Views
  • 1 replies
  • 1 kudos
Latest Reply
behema1074
New Contributor II
  • 1 kudos

Hi, I am facing the same problem. Have you already been able to solve the problem?

  • 1 kudos
htd350
by New Contributor II
  • 279 Views
  • 1 replies
  • 2 kudos

Resolved! Cluster by auto pyspark

I can find documentation to enable automatic liquid clustering with SQL code: CLUSTER BY AUTO. But how do I do this with PySpark? I know I can do it with spark.sql("ALTER TABLE CLUSTER BY AUTO"), but ideally I want to pass it as an .option(). Thanks in...

  • 279 Views
  • 1 replies
  • 2 kudos
Latest Reply
BigRoux
Databricks Employee
  • 2 kudos

To enable automatic liquid clustering with PySpark and pass it as an `.option()` during table creation or modification, you currently cannot directly use a `.clusterBy("AUTO")` method in PySpark's `DataFrameWriter` API. However, there are workarounds...
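Pending a native `.option()`, the `spark.sql` workaround mentioned in the question can be wrapped in small helpers. This is a sketch with a placeholder table name; only the SQL strings are built here, so the statements can be inspected without a running Spark session.

```python
# Helpers that build the liquid-clustering SQL statements. In a notebook
# you would pass the result to spark.sql(...), e.g.:
#   spark.sql(alter_to_auto_clustering("my_table"))

def create_with_auto_clustering(table: str, columns_ddl: str) -> str:
    """CREATE TABLE statement with automatic liquid clustering enabled."""
    return f"CREATE TABLE {table} ({columns_ddl}) CLUSTER BY AUTO"

def alter_to_auto_clustering(table: str) -> str:
    """ALTER an existing table to use automatic liquid clustering."""
    return f"ALTER TABLE {table} CLUSTER BY AUTO"

print(alter_to_auto_clustering("my_table"))  # → ALTER TABLE my_table CLUSTER BY AUTO
```

Calling the ALTER helper right after `df.write.saveAsTable(...)` approximates the missing writer option.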

  • 2 kudos
Rjdudley
by Honored Contributor
  • 224 Views
  • 2 replies
  • 1 kudos

Asinine bad word detection

Are you kidding me here--I couldn't post this reply because (see arrows because I can't say the words)?  I've run afoul of this several times before, bad word detection was a solved problem in the 1990s and there is even a term for errors like this--...

  • 224 Views
  • 2 replies
  • 1 kudos
Latest Reply
Advika
Databricks Employee
  • 1 kudos

Hello @Rjdudley! Thank you for bringing this to our attention. We understand how frustrating it can be to have your message incorrectly flagged, especially when you're contributing meaningfully. While our filters are in place to maintain a safe space...

  • 1 kudos
1 More Replies
tw1
by New Contributor III
  • 238 Views
  • 5 replies
  • 1 kudos

AI/BI Dashboard - Hide Column in Table Visualization, but not in exported data

How can I hide a specific column from a table visualization, but not in the exported data? I have over 200 columns in my query result and the UI freezes when I show it in a table visualization. So I want to hide specific columns, but if I export ...

  • 238 Views
  • 5 replies
  • 1 kudos
Latest Reply
tw1
New Contributor III
  • 1 kudos

.

  • 1 kudos
4 More Replies
tejas8196
by New Contributor II
  • 1807 Views
  • 3 replies
  • 0 kudos

DAB not updating zone_id when redeployed

Hey folks, facing an issue with zone_id not getting overridden when redeploying the DAB template to the Databricks workspace. The Databricks job is already deployed and has the "ap-south-1a" zone_id. I wanted to make it "auto", so I have made the changes to th...

Get Started Discussions
data engineering
  • 1807 Views
  • 3 replies
  • 0 kudos
Latest Reply
KungFuMaster
New Contributor II
  • 0 kudos

Hello. I had a similar issue when using DAB in CI/CD but was able to fix it.

  • 0 kudos
2 More Replies
mancosta
by New Contributor
  • 174 Views
  • 0 replies
  • 0 kudos

Joblib with optuna and SB3

Hi everyone, I am training some reinforcement learning models and I am trying to automate the hyperparameter search using Optuna. I saw in the documentation that you can use joblib with Spark as a backend to train in parallel. I got that working with t...
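The parallel-trials pattern the question describes can be illustrated without Optuna or Spark. This is a stdlib stand-in, not the Optuna API: `concurrent.futures` plays the role of joblib's backend, and the objective is a toy quadratic rather than an RL training run.

```python
# Stdlib sketch of the "evaluate trials in parallel, keep the best" pattern.
# With Optuna this would be study.optimize(objective, n_trials=..., n_jobs=N)
# over a Spark-backed joblib backend; here threads stand in for workers.
from concurrent.futures import ThreadPoolExecutor

def objective(lr: float) -> float:
    # Toy "loss", minimized at lr = 0.1.
    return (lr - 0.1) ** 2

def run_trials(candidates: list[float]) -> float:
    """Evaluate all candidate hyperparameters in parallel, return the best one."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        losses = list(pool.map(objective, candidates))
    best = min(range(len(candidates)), key=losses.__getitem__)
    return candidates[best]

print(run_trials([0.001, 0.01, 0.1, 1.0]))  # → 0.1
```

For real RL objectives the workers must be process- or executor-based, since training is CPU/GPU-bound rather than I/O-bound.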

  • 174 Views
  • 0 replies
  • 0 kudos
ChristianRRL
by Valued Contributor
  • 533 Views
  • 2 replies
  • 1 kudos

Resolved! DBX Community Pending Answers

Hi there, in the past I've posted questions in this community and I would consistently get responses back in a very reasonable time frame. Typically I think most of my posts have an initial response back within 1-2 days, or just a few days (I don't t...

  • 533 Views
  • 2 replies
  • 1 kudos
Latest Reply
ChristianRRL
Valued Contributor
  • 1 kudos

Thank you for clarifying. I know some questions may be a bit more technical, but I hope I get some feedback/suggestions, particularly to my UMF Best Practice question!

  • 1 kudos
1 More Replies
sys08001
by New Contributor II
  • 380 Views
  • 1 replies
  • 1 kudos

Resolved! Is there a way to iterate over a combination of parameters using a "for each" task?

Hi, I have a notebook with two input widgets set up ("current_month" and "current_year") that the notebook grabs values from and uses for processing. I want to be able to provide a list of input values in the "for each" task where each value is actual...

  • 380 Views
  • 1 replies
  • 1 kudos
Latest Reply
ashraf1395
Honored Contributor
  • 1 kudos

Hi there @sys08001, yes, it is possible. You can pass the input values for the for_each task in JSON format, somewhat like this: [ { "tableName": "product_2", "id": "1", "names": "John Doe", "created_at": "2025-02-22T10:00:00.000Z" },...
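For the two widgets named in the question, the for-each input could look like the fragment below (values are illustrative). Each object becomes one iteration of the loop, and the inner notebook task maps the keys onto its widget parameters:

```json
[
  { "current_month": "1", "current_year": "2025" },
  { "current_month": "2", "current_year": "2025" },
  { "current_month": "3", "current_year": "2025" }
]
```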

  • 1 kudos
rodneyc8063
by New Contributor II
  • 691 Views
  • 2 replies
  • 0 kudos

Azure Databricks - Databricks AI Assistant missing on Azure Student Subscription?

I am going through a course learning Azure Databricks and I had created a new Azure Databricks workspace. I am the owner of the subscription and created everything, so I assume I should have full admin rights. The following is my setup: Azure Student S...

  • 691 Views
  • 2 replies
  • 0 kudos
Latest Reply
Takuya-Omi
Valued Contributor III
  • 0 kudos

@rodneyc8063 According to Azure’s documentation: Tip: Admins: If you’re unable to enable Databricks Assistant, you might need to disable the "Enforce data processing within workspace Geography for Designated Services" setting. See “For an ac...

  • 0 kudos
1 More Replies
Prashanthkumar
by New Contributor III
  • 7856 Views
  • 12 replies
  • 2 kudos

Is it possible to view Databricks cluster metrics using REST API

I am looking for some help on getting Databricks cluster metrics such as memory utilization, CPU utilization, memory swap utilization, and free file system using the REST API. I am trying it in Postman using a Databricks token and with my Service Principal bear...

  • 7856 Views
  • 12 replies
  • 2 kudos
Latest Reply
Prashanthkumar
New Contributor III
  • 2 kudos

Thank you Walter, will keep an eye.

  • 2 kudos
11 More Replies
