Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Oliver_Angelil
by Valued Contributor II
  • 8771 Views
  • 9 replies
  • 1 kudos

How to use the git CLI in databricks?

After making some changes in my feature branch, I committed and pushed some work to Azure DevOps (note that I have not yet raised a PR or merged into any other branch). Many of the files I committed are data files, so I would like to reverse the co...

Latest Reply
AntonDBUser
New Contributor II
  • 1 kudos

Any updates on this? We still can't manage to run Git CLI commands from Databricks. Appreciate any input on this!

  • 1 kudos
8 More Replies
dg
by New Contributor II
  • 16444 Views
  • 7 replies
  • 3 kudos

Trying to use pdf2image on databricks

Trying to use pdf2image on Databricks, but it's failing with "PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?" I've installed pdf2image & poppler-utils by running the following in a cell: %pip install pdf2image %pip ...

Latest Reply
Slalom_Tobias
New Contributor III
  • 3 kudos

Seems like this thread has died, but for posterity, databricks provides the following code for installing poppler on a cluster. The code is sourced from the dbdemos accelerators, specifically the "LLM Chatbot With Retrieval Augmented Generation (RAG)...

  • 3 kudos
6 More Replies
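
The error message in this thread asks whether poppler is installed and in PATH. A minimal sketch of how that could be checked from a notebook cell before calling pdf2image, assuming the `pdfinfo` and `pdftoppm` command-line tools from Poppler are what pdf2image needs to find; the function name `missing_poppler_tools` is made up for this example:

```python
import shutil

# Poppler command-line tools that pdf2image shells out to.
REQUIRED_TOOLS = ("pdfinfo", "pdftoppm")


def missing_poppler_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_poppler_tools()
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("Poppler tools found on PATH.")
```

The check only covers the machine where the cell runs, so it says nothing about worker nodes if the conversion is distributed.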
houstonamoeba
by New Contributor III
  • 3960 Views
  • 7 replies
  • 1 kudos

Resolved! examples on python sdk for install libraries

Hi everyone, I'm planning to use the Databricks Python CLI "install_libraries". Can someone please post examples of the install_libraries function? https://github.com/databricks/databricks-cli/blob/main/databricks_cli/libraries/api.py

Latest Reply
Loop-Insist
New Contributor II
  • 1 kudos

Here you go, using the Python SDK:
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute
w = WorkspaceClient(host="yourhost", token="yourtoken")
# Create an array of Library objects to be installed
libraries_to_install = [compute...

  • 1 kudos
6 More Replies
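
The reply above is cut off before the library list, so it is not completed here. As a separate illustration, this sketch builds the JSON body that the Libraries REST API (`POST /api/2.0/libraries/install`) expects for PyPI packages. The helper name `build_install_payload`, the cluster ID, and the package names are hypothetical, and nothing is sent over the network:

```python
import json


def build_install_payload(cluster_id, pypi_packages):
    """Build the request body for POST /api/2.0/libraries/install."""
    return {
        "cluster_id": cluster_id,
        # One entry per library; each PyPI library is {"pypi": {"package": ...}}.
        "libraries": [{"pypi": {"package": name}} for name in pypi_packages],
    }


if __name__ == "__main__":
    # Hypothetical cluster ID and packages, for illustration only.
    payload = build_install_payload("0000-000000-example", ["pdf2image", "requests==2.31.0"])
    print(json.dumps(payload, indent=2))
```

Sending the body would still need the workspace host and a token, as in the reply's `WorkspaceClient(host=..., token=...)` call.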
superanna
by New Contributor II
  • 871 Views
  • 1 replies
  • 1 kudos

Yes, still illegal. And I also don’t understand why it is equated with drugs, but alcohol is not! Not a single murder has yet been committed under can...

Yes, still illegal. And I also don’t understand why it is equated with drugs, but alcohol is not! Not a single murder has yet been committed under cannabis, not a single war has been unleashed. It's just that people who don't use don't understand how...

Latest Reply
Mz_Yvette
New Contributor II
  • 1 kudos

You are absolutely right! I have found it to be a big relief medically. I have nerve conditions which is not operable. The legal medical pills almost literally killed me, and if it wasn't for my husband's quick thinking, I wouldn't be here to share t...

  • 1 kudos
THIAM_HUATTAN
by Valued Contributor
  • 1146 Views
  • 1 replies
  • 0 kudos

community.cloud.databricks.com

https://community.cloud.databricks.com/ Last night I was still able to use it. This morning it broke down completely, and I could not log in. Please help.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @THIAM HUAT TAN, thank you for reaching out, and we're sorry to hear about this login issue! We have a Community Edition login troubleshooting post on Community. Please take a look and follow the troubleshooting steps. If the steps do not res...

  • 0 kudos
Abel_Martinez
by Contributor
  • 15681 Views
  • 10 replies
  • 38 kudos

Why do Python logs show the [REDACTED] literal in place of spaces when I use dbutils.secrets.get in my code?

When I use dbutils.secrets.get in my code, spaces in the log are replaced by the "[REDACTED]" literal. This is very annoying and makes the log difficult to read. Any idea how to avoid this? See my screenshot...

Latest Reply
jlb0001
New Contributor III
  • 38 kudos

I ran into the same issue and found that the reason was that the notebook included some test keys with values of "A" and "B" for simple testing. I noticed that "A" or "B" anywhere in a string was shown as "[REDACTED]". So, in my case, it was an eas...

  • 38 kudos
9 More Replies
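
The behaviour described in this thread can be reproduced with a small simulation. This is only a sketch of substring-based redaction, not the Databricks implementation, and the function name `redact` is invented for the example. It shows why a secret whose value is a single space or a single letter affects so much of the output:

```python
import re

REDACTION_MARKER = "[REDACTED]"


def redact(text, secret_values):
    """Replace every occurrence of any secret value in text with the marker."""
    # Skip empty values and try longer secrets first.
    values = sorted((v for v in secret_values if v), key=len, reverse=True)
    if not values:
        return text
    # A single regex pass, so markers already inserted are never rescanned.
    pattern = re.compile("|".join(re.escape(v) for v in values))
    return pattern.sub(REDACTION_MARKER, text)


if __name__ == "__main__":
    # A secret whose value is one space redacts every space in the log line.
    print(redact("INFO job started", [" "]))  # INFO[REDACTED]job[REDACTED]started
```

With test secrets such as "A" and "B", the same function would redact every capital A and B in the output, which matches what the reply reports.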
testname1
by New Contributor II
  • 1898 Views
  • 1 replies
  • 1 kudos

Is it possible to use the databricks-sql-nodejs driver in a create-react-app app?

I'm using the typescript example for the databricks sql driver but I'm getting errors when compiling:

image.png
Latest Reply
User16502773013
Databricks Employee
  • 1 kudos

Hello @asdf fdsa, the NodeJS connector is built for the NodeJS environment; it will not integrate with ReactJS. For cases where web execution is needed, we advise using the SQL Exec API. Please check the documentation here: https://docs.databricks.com/sql/a...

  • 1 kudos
alejandrofm
by Valued Contributor
  • 2870 Views
  • 2 replies
  • 1 kudos

Understand if the configs I use to SparkSession.builder still make sense for Databricks 10+

Hi! I currently have this as an old generic template with amends over time to optimize Databricks Spark execution, can you help me to know if this still makes sense for v10-11-12 or if there are new recommendations? Maybe some of this is making my pr...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Alejandro Martinez: Hi! Your template seems to be a good starting point for configuring a SparkSession in Databricks. However, there are some new recommendations that you can consider for Databricks runtime versions v10-11-12. Here are some suggest...

  • 1 kudos
1 More Replies
KellenO
by New Contributor II
  • 2114 Views
  • 2 replies
  • 8 kudos

Resolved! How can I use cluster autoscaling with intensive subprocess calls?

I have a custom application/executable that I upload to DBFS and transfer to my cluster's local storage for execution. I want to call multiple instances of this application in parallel, which I've only been able to successfully do with Python's subpr...

Latest Reply
Anonymous
Not applicable
  • 8 kudos

Autoscaling works for Spark jobs only. It works by monitoring the job queue, which plain Python code never enters. If it's just Python code, try a single-node cluster. https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling

  • 8 kudos
1 More Replies
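
Following the single-node suggestion in this thread, here is a minimal sketch of running several instances of an executable in parallel with Python's subprocess module and a thread pool. The custom application is replaced by a stand-in Python one-liner, and the names `run_instance` and `run_in_parallel` are made up for the example:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_instance(args):
    """Run one process to completion and return (exit code, stdout)."""
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout.strip()


def run_in_parallel(commands, max_workers=4):
    """Run each command as its own process, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input commands.
        return list(pool.map(run_instance, commands))


if __name__ == "__main__":
    # Stand-in for the custom executable: a tiny Python one-liner per input.
    commands = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]
    for code, output in run_in_parallel(commands):
        print(code, output)
```

Every process here starts on the machine that runs the cell, so no Spark tasks are created and, as the reply notes, autoscaling has nothing to react to.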
Trey
by New Contributor III
  • 1829 Views
  • 2 replies
  • 3 kudos

Where do you usually store and manage "JDBC credentials" to use on databricks notebook?

Hi all, I would like to improve the way I use JDBC credential information (ID/PW, host, port, etc.). Where do you usually store and use the JDBC credentials? Thanks for your help in advance!

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Kwangwon Yi, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks...

  • 3 kudos
1 More Replies
109005
by New Contributor III
  • 2605 Views
  • 6 replies
  • 0 kudos

Does Databricks support proxy for BigQuery?

Hi team, we tried to use the proxy options for BigQuery Spark connector as mentioned in this documentation. However, we keep getting "connect timed out" error. The proxy host is working on our end. This made us wonder if by chance Databricks does not...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Ayushi Pandey, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Than...

  • 0 kudos
5 More Replies
BradSheridan
by Valued Contributor
  • 3999 Views
  • 9 replies
  • 4 kudos

Resolved! How to use cloudFiles to completely overwrite the target

Hey there Community!! I have a client that will produce a CSV file daily that needs to be moved from Bronze -> Silver. Unfortunately, this source file will always be a full set of data....not incremental. I was thinking of using AutoLoader/cloudFil...

Latest Reply
BradSheridan
Valued Contributor
  • 4 kudos

I upvoted all of @werners' suggestions because they are all very valid ways of addressing my need (the true power/flexibility of the Databricks UDAP!!!). However, it turns out I'm going to end up getting incremental data after all :). So now the flow wi...

  • 4 kudos
8 More Replies
lecardozo
by New Contributor II
  • 4689 Views
  • 5 replies
  • 1 kudos

Resolved! Problems with HiveMetastoreClient and internal Databricks Metastore.

I've been trying to use the HiveMetastoreClient class in Scala to extract some metadata from the Databricks internal Metastore, without success. I'm currently using the 7.3 LTS runtime. The error seems to be related to some kind of inconsistency between...

Latest Reply
lecardozo
New Contributor II
  • 1 kudos

Thanks for the reference, @Atanu Sarkar. It seems a little odd to me that I'd need to change the internal Databricks Metastore table to add a column expected by the default Scala client. I'm afraid this could cause issues with other users/jobs ...

  • 1 kudos
4 More Replies
kmartin62
by New Contributor III
  • 5435 Views
  • 9 replies
  • 4 kudos

Resolved! Configure Databricks (spark) context from PyCharm

Hello. I'm trying to connect to Databricks from my IDE (PyCharm) and then run delta table queries from there. However, the cluster I'm trying to access has to give me permission. In this case, I'd go to my cluster, run the cell which gives me permiss...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

"I'm trying to connect to Databricks from my IDE (PyCharm) and then run delta table queries from there." If you are going to deploy your code to Databricks later, the only solutions I see are to use databricks-connect or just make development envi...

  • 4 kudos
8 More Replies