Data Engineering

Forum Posts

by Oliver_Angelil • Valued Contributor II

04-26-2023 8:51:11 AM

10713 Views
9 replies
1 kudos

How to use the git CLI in databricks?

After making some changes in my feature branch, I have committed and pushed (to Azure DevOps) some work (note I have not yet raised a PR or merged to any other branch). Many of the files I committed are data files and so I would like to reverse the co...

Latest Reply

AntonDBUser
New Contributor III

10-21-2024 4:24:59 AM

1 kudos

Any updates on this? We still can't manage to run Git CLI commands from Databricks. Appreciate any input on this!

8 More Replies

by dg • New Contributor II

10-20-2021 2:30:38 AM

18661 Views
7 replies
3 kudos

Trying to use pdf2image on databricks

Trying to use pdf2image on databricks, but it's failing with "PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?" I've installed pdf2image & poppler-utils by running the following in a cell:
%pip install pdf2image
%pip ...

Latest Reply

Slalom_Tobias
New Contributor III

03-27-2024 12:34:24 PM

3 kudos

Seems like this thread has died, but for posterity, databricks provides the following code for installing poppler on a cluster. The code is sourced from the dbdemos accelerators, specifically the "LLM Chatbot With Retrieval Augmented Generation (RAG)...
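Before reinstalling anything, it can help to confirm whether the poppler binaries are actually visible on the PATH of the Python process, since that is what the error message is asking about. The sketch below is only a diagnostic, not the dbdemos install code mentioned above; it assumes pdf2image needs the `pdfinfo` and `pdftoppm` executables from poppler-utils, and the helper name `missing_poppler_binaries` is made up for illustration.

```python
import shutil

# pdf2image shells out to these poppler-utils binaries, so both must be on PATH.
POPPLER_BINARIES = ("pdfinfo", "pdftoppm")


def missing_poppler_binaries(path=None):
    """Return the poppler binaries that cannot be found on the given PATH.

    With path=None, shutil.which searches the PATH of the current process.
    """
    return [name for name in POPPLER_BINARIES if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    missing = missing_poppler_binaries()
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("poppler binaries found")
```

If the list is non-empty when run in a notebook cell, the Python package was installed but the system binaries were not (or they live in a directory that is not on PATH).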

6 More Replies

by houstonamoeba • New Contributor III

05-09-2023 4:00:52 PM

4746 Views
7 replies
1 kudos

Resolved! examples on python sdk for install libraries

Hi Everyone, I'm planning to use the Databricks Python CLI "install_libraries". Can someone please post examples of the function install_libraries? https://github.com/databricks/databricks-cli/blob/main/databricks_cli/libraries/api.py

Latest Reply

Loop-Insist
New Contributor II

08-25-2023 7:15:54 AM

1 kudos

Here you go using Python SDK:
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute
w = WorkspaceClient(host="yourhost", token="yourtoken")
# Create an array of Library objects to be installed
libraries_to_install = [compute...
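For anyone who wants to see the shape of the library list without the SDK, here is a minimal sketch that only builds the JSON body for the Libraries REST endpoint (`POST /api/2.0/libraries/install`). It does not call Databricks and is not the code from the reply above; the helper name `build_install_payload`, the cluster ID and the package names are placeholders chosen for illustration.

```python
import json


def build_install_payload(cluster_id, pypi_packages=(), wheel_paths=()):
    """Build the JSON body for POST /api/2.0/libraries/install.

    Each library is a small dict keyed by its type: "pypi" takes a nested
    {"package": ...} object, while "whl" takes the wheel path directly.
    """
    libraries = [{"pypi": {"package": pkg}} for pkg in pypi_packages]
    libraries += [{"whl": path} for path in wheel_paths]
    return {"cluster_id": cluster_id, "libraries": libraries}


if __name__ == "__main__":
    # Hypothetical cluster ID and package names, for illustration only.
    payload = build_install_payload(
        "1234-567890-abcde123",
        pypi_packages=["simplejson", "requests==2.31.0"],
    )
    print(json.dumps(payload, indent=2))
```

Sending the payload (with a bearer token) is left out on purpose so the sketch runs anywhere.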

6 More Replies

by superanna • New Contributor II

06-26-2022 3:17:36 AM

1005 Views
1 replies
1 kudos

Yes, still illegal. And I also don’t understand why it is equated with drugs, but alcohol is not! Not a single murder has yet been committed under can...

Yes, still illegal. And I also don’t understand why it is equated with drugs, but alcohol is not! Not a single murder has yet been committed under cannabis, not a single war has been unleashed. It's just that people who don't use don't understand how...

Latest Reply

Mz_Yvette
New Contributor II

07-16-2023 5:18:41 AM

1 kudos

You are absolutely right! I have found it to be a big relief medically. I have nerve conditions which are not operable. The legal medical pills almost literally killed me, and if it wasn't for my husband's quick thinking, I wouldn't be here to share t...

by THIAM_HUATTAN • Valued Contributor

05-08-2023 5:04:00 PM

1347 Views
1 replies
0 kudos

community.cloud.databricks.com

https://community.cloud.databricks.com/ Last night I was still able to use it. This morning it broke down completely, and I could not log in. Please help.

Latest Reply

Anonymous
Not applicable

05-08-2023 10:12:47 PM

0 kudos

Hi @THIAM HUAT TAN Thank you for reaching out, and we’re sorry to hear about this log-in issue! We have this Community Edition login troubleshooting post on Community. Please take a look, and follow the troubleshooting steps. If the steps do not res...

by Abel_Martinez • Contributor

11-23-2022 1:40:12 AM

20021 Views
10 replies
39 kudos

Why do Python logs show the [REDACTED] literal in place of spaces when I use dbutils.secrets.get in my code?

When I use dbutils.secrets.get in my code, spaces in the log are replaced by the "[REDACTED]" literal. This is very annoying and makes the log difficult to read. Any idea how to avoid this? See my screenshot...

Latest Reply

jlb0001
New Contributor III

04-25-2023 10:00:19 AM

39 kudos

I ran into the same issue and found that the reason was that the notebook included some test keys with values of "A" and "B" for simple testing. I noticed that any string with a substring of "A" or "B" was replaced by "[REDACTED]". So, in my case, it was an eas...
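To make the effect described above concrete, here is a small simulation of substring-based redaction. This is not Databricks' actual implementation, just a sketch under the assumption that output is scanned for every secret value that has been read; the function name `redact` is made up for illustration.

```python
import re


def redact(text, secret_values):
    """Replace every occurrence of any secret value with the literal [REDACTED]."""
    # Drop empty values and try longer secrets first so they win over shorter ones.
    values = sorted({v for v in secret_values if v}, key=len, reverse=True)
    if not values:
        return text
    # A single regex pass means the inserted "[REDACTED]" text is never re-scanned.
    pattern = re.compile("|".join(re.escape(v) for v in values))
    return pattern.sub("[REDACTED]", text)


if __name__ == "__main__":
    # A secret whose value is a single space mangles every log line.
    print(redact("job finished ok", [" "]))
    # Test secrets "A" and "B" hit any text containing those letters.
    print(redact("Loading table ABC", ["A", "B"]))
```

Seen this way, the fix is on the secret side: very short or trivial secret values (a space, a single letter) will match almost any log line.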

9 More Replies

by testname1 • New Contributor II

03-26-2023 2:36:16 PM

2313 Views
1 replies
1 kudos

Is it possible to use the databricks-sql-nodejs driver in a create-react-app app?

I'm using the typescript example for the databricks sql driver but I'm getting errors when compiling:

Latest Reply

User16502773013
Databricks Employee

04-20-2023 4:01:23 PM

1 kudos

Hello @asdf fdsa, The NodeJS connector is built for the NodeJS environment; it will not integrate with ReactJS. For cases where web execution is needed, we advise using the SQL Exec API. Please check the documentation here: https://docs.databricks.com/sql/a...
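As a rough illustration of what calling such an API looks like, the sketch below builds (but does not send) an HTTP request, assuming the reply refers to the SQL Statement Execution API at `/api/2.0/sql/statements`. The host, token and warehouse ID are placeholders, and `build_statement_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_statement_request(host, token, warehouse_id, statement):
    """Build a POST request that submits one SQL statement to a SQL warehouse."""
    body = json.dumps({"warehouse_id": warehouse_id, "statement": statement}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/sql/statements",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; nothing is sent over the network here.
    req = build_statement_request(
        "https://example.cloud.databricks.com", "my-token", "my-warehouse-id", "SELECT 1"
    )
    print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would actually submit it; that step is omitted so the example stays runnable without a workspace.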

by alejandrofm • Valued Contributor

03-30-2023 2:01:09 PM

3471 Views
2 replies
1 kudos

Understand if the configs I use to SparkSession.builder still make sense for Databricks 10+

Hi! I currently have this as an old generic template with amends over time to optimize Databricks Spark execution, can you help me to know if this still makes sense for v10-11-12 or if there are new recommendations? Maybe some of this is making my pr...

Latest Reply

Anonymous
Not applicable

04-02-2023 9:35:11 AM

1 kudos

@Alejandro Martinez: Hi! Your template seems to be a good starting point for configuring a SparkSession in Databricks. However, there are some new recommendations that you can consider for Databricks runtime versions v10-11-12. Here are some suggest...

1 More Replies

by KellenO • New Contributor II

12-08-2022 12:30:22 PM

2496 Views
2 replies
8 kudos

Resolved! How can I use cluster autoscaling with intensive subprocess calls?

I have a custom application/executable that I upload to DBFS and transfer to my cluster's local storage for execution. I want to call multiple instances of this application in parallel, which I've only been able to successfully do with Python's subpr...

Latest Reply

Anonymous
Not applicable

12-08-2022 4:18:17 PM

8 kudos

Autoscaling works for Spark jobs only. It works by monitoring the job queue, which Python code won't go into. If it's just Python code, try single node. https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling
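If the single-node route is taken, the parallel subprocess calls can be kept on the driver with a bounded worker pool. This is a minimal sketch, not the original poster's code: the Python interpreter stands in for the custom executable, and `run_all` and `max_workers` are illustrative names.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, max_workers=4):
    """Run each command as a subprocess, at most max_workers at a time.

    Returns the exit codes in the same order as the input commands.
    """
    def run_one(cmd):
        # Threads are enough here: each one just waits on an external process.
        return subprocess.run(cmd, capture_output=True, text=True).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Stand-in for the custom executable: the interpreter exiting with a given code.
    commands = [[sys.executable, "-c", f"import sys; sys.exit({code})"] for code in (0, 0, 3)]
    print(run_all(commands))
```

Since none of this work is submitted as Spark tasks, the cluster has nothing to scale on, which is consistent with the reply above; sizing the single node to the number of parallel processes is the knob that matters.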

1 More Replies

by Trey • New Contributor III

09-18-2022 10:30:13 PM

2056 Views
2 replies
3 kudos

Where do you usually store and manage "JDBC credentials" to use on databricks notebook?

Hi all, I would like to improve the way I use JDBC credential information (ID/PW, host, port, etc.). Where do you guys usually store and use the JDBC credentials? Thanks for your help in advance!

Latest Reply

Anonymous
Not applicable

10-02-2022 3:55:20 AM

3 kudos

Hi @Kwangwon Yi Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks...

1 More Replies

by 109005 • New Contributor III

09-02-2022 12:14:02 AM

3098 Views
6 replies
0 kudos

Does Databricks support proxy for BigQuery?

Hi team, we tried to use the proxy options for the BigQuery Spark connector as mentioned in this documentation. However, we keep getting a "connect timed out" error. The proxy host is working on our end. This made us wonder if by chance Databricks does not...

Latest Reply

Vidula
Honored Contributor

09-20-2022 9:58:15 PM

0 kudos

Hi @Ayushi Pandey Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...

5 More Replies

by BradSheridan • Valued Contributor

07-27-2022 6:13:27 AM

4720 Views
9 replies
4 kudos

Resolved! How to use cloudFiles to completely overwrite the target

Hey there Community!! I have a client that will produce a CSV file daily that needs to be moved from Bronze -> Silver. Unfortunately, this source file will always be a full set of data....not incremental. I was thinking of using AutoLoader/cloudFil...

Latest Reply

BradSheridan
Valued Contributor

08-12-2022 10:44:42 AM

4 kudos

I "up voted" all of @werners suggestions b/c they are all very valid ways of addressing my need (the true power/flexibility of the Databricks UDAP!!!). However, turns out I'm going to end up getting incremental data after all :). So now the flow wi...

8 More Replies

by ESG • New Contributor II

05-27-2022 7:42:50 AM

690 Views
0 replies
1 kudos

The (ESGRI) introduces the world’s most wanted environmental social and governance cutting edge easy to use software tools with next-gen tech.

by lecardozo • New Contributor II

02-17-2022 7:22:57 AM

5513 Views
5 replies
1 kudos

Resolved! Problems with HiveMetastoreClient and internal Databricks Metastore.

I've been trying to use the HiveMetastoreClient class in Scala to extract some metadata from Databricks internal Metastore, without success. I'm currently using the 7.3 LTS runtime. The error seems to be related to some kind of inconsistency between...

Latest Reply

lecardozo
New Contributor II

03-04-2022 9:28:58 AM

1 kudos

Thanks for the reference, @Atanu Sarkar. Seems a little odd to me that I'd need to change the internal Databricks Metastore table to add a column expected by the default Scala client. I'm afraid this could cause issues with other users/jobs ...

4 More Replies

by kmartin62 • New Contributor III

11-29-2021 4:00:04 AM

6241 Views
9 replies
4 kudos

Resolved! Configure Databricks (spark) context from PyCharm

Hello. I'm trying to connect to Databricks from my IDE (PyCharm) and then run delta table queries from there. However, the cluster I'm trying to access has to give me permission. In this case, I'd go to my cluster, run the cell which gives me permiss...

Latest Reply

Hubert-Dudek
Esteemed Contributor III

11-29-2021 4:13:14 AM

4 kudos

"I'm trying to connect to Databricks from my IDE (PyCharm) and then run delta table queries from there." If you are going to deploy your code to Databricks later, the only solution I see is to use databricks-connect or just make development envi...

8 More Replies