Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Snowhow1
by New Contributor II
  • 12369 Views
  • 1 replies
  • 1 kudos

Logging when using multiprocessing with joblib

Hi, I'm using joblib for multiprocessing in one of our processes. Logging works well (apart from odd py4j errors, which I suppress) except inside the multiprocessing workers. Also, how do I suppress the other errors that I always receive on DB - perha...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Sam G: It seems like the issue is related to the py4j library used by Spark, and not specifically to joblib or multiprocessing. The error message indicates a network error while sending a command between the Python process and the Java Virt...
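A minimal sketch of one way to approach both parts of the question, using only the standard `logging` module. It assumes the py4j messages come from loggers under the `py4j` name and that joblib workers run in separate processes that do not inherit the driver's logging setup, so the logger is configured inside the worker function. The names `quiet_py4j`, `get_worker_logger`, and `process_item` are hypothetical.

```python
import logging


def quiet_py4j(level=logging.ERROR):
    """Raise the threshold of the py4j loggers so routine gateway messages are hidden."""
    logging.getLogger("py4j").setLevel(level)


def get_worker_logger(name="worker"):
    """Return a logger with its own handler; safe to call repeatedly in one process."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking duplicate handlers when a worker is reused
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(processName)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records from being emitted twice via the root logger
    return logger


def process_item(x):
    """Hypothetical unit of work; logging is set up inside the worker process."""
    log = get_worker_logger()
    log.info("processing %s", x)
    return x * x
```

With joblib this would be called as `Parallel(n_jobs=2)(delayed(process_item)(i) for i in range(4))`, after calling `quiet_py4j()` once on the driver.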

  • 1 kudos
jhon341
by New Contributor
  • 9947 Views
  • 1 replies
  • 1 kudos

How can I optimize Spark performance in Databricks for large-scale data processing

I'm using Databricks to process large-scale data with Apache Spark, but I'm experiencing performance issues. Processing is taking longer than expected, and I'm running into memory and CPU usage limits. I want to optimize the perform...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@jhon marton: Optimizing Spark performance in Databricks for large-scale data processing can involve a combination of techniques, configurations, and best practices. Below are some recommendations that can help improve the performance of your Spark ...

  • 1 kudos
lugger1
by New Contributor III
  • 4478 Views
  • 1 replies
  • 1 kudos

Resolved! What is the best way to use credentials for API calls from databricks notebook?

Hello, I have a Databricks account on Azure, and the goal is to compare different image tagging services from Azure, GCP, and AWS via the corresponding API calls from a Python notebook. I have problems with GCP Vision API calls, specifically with credentials...

Latest Reply
lugger1
New Contributor III
  • 1 kudos

Ok, here is a trick: in my case, the file with the GCP credentials is stored in the notebook workspace storage, which is not visible via os.environ. So the solution is to read the content of this file and save it to the cluster storage attached to the no...
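The trick described above might look roughly like the sketch below. It assumes the credentials content has already been read into a string (how it is read from the workspace is left to the caller) and that the Google client library locates the key file through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. The function name `export_gcp_credentials` is hypothetical.

```python
import json
import os
import tempfile


def export_gcp_credentials(credentials_json, directory=None):
    """Write the credentials to a local file and point the environment at it."""
    json.loads(credentials_json)  # fail early if the content is not valid JSON
    # mkstemp creates the file readable and writable only by the current user
    fd, path = tempfile.mkstemp(suffix=".json", dir=directory)
    with os.fdopen(fd, "w") as fh:
        fh.write(credentials_json)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = path
    return path
```

The returned path can be removed with `os.remove(path)` once the API calls are done, so the key does not linger on the cluster disk.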

  • 1 kudos
testname1
by New Contributor II
  • 3215 Views
  • 1 replies
  • 1 kudos

Is it possible to use the databricks-sql-nodejs driver in a create-react-app app?

I'm using the TypeScript example for the Databricks SQL driver, but I'm getting errors when compiling:

image.png
Latest Reply
User16502773013
Databricks Employee
  • 1 kudos

Hello @asdf fdsa, the NodeJS connector is built for the Node.js environment; it will not integrate with ReactJS. For cases where web execution is needed, we advise using the SQL Exec API. Please check the documentation here: https://docs.databricks.com/sql/a...
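For illustration, a request to the SQL Statement Execution API could be built as in the sketch below. It assumes the `POST /api/2.0/sql/statements/` endpoint with `warehouse_id`, `statement`, and `wait_timeout` in the JSON body; the host, token, and warehouse ID are placeholders, and `build_statement_request` is a hypothetical helper. It is written in Python for a backend service, since an access token should not be shipped to browser code.

```python
import json
import urllib.request


def build_statement_request(host, token, warehouse_id, statement):
    """Build (but do not send) a request that submits one SQL statement."""
    body = {
        "warehouse_id": warehouse_id,  # placeholder: ID of a SQL warehouse
        "statement": statement,
        "wait_timeout": "30s",  # assumption: wait up to 30 seconds for the result
    }
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/sql/statements/",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(req)` and the JSON response parsed for the statement status and result.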

  • 1 kudos
Diego_MSFT
by New Contributor II
  • 8920 Views
  • 1 replies
  • 4 kudos

Automating the rerun of a job (with several tasks) // automating the notification of specific failed tasks after retrying // error handling in an Azure Data Factory pipeline with a Databricks notebook

Hi Databricks experts: I'm using Databricks on Azure. I'd like to understand the following: 1) whether there is a way of automating the rerun of specific failed tasks from a job (with several tasks), for example if I have 4 tasks, and the task 1 and 2 h...

Latest Reply
Lindberg
New Contributor III
  • 4 kudos

You can use "retries". In Workflows, select your job and the task, then configure retries in the options below. You can also see more options at: https://learn.microsoft.com/pt-br/azure/databricks/dev-tools/api/2.0/jobs?source=recommendations
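The same retry settings can also be set through the Jobs API rather than the UI. The sketch below assumes the task settings fields `max_retries`, `min_retry_interval_millis`, and `retry_on_timeout`; the task key and notebook path are made-up placeholders, and `with_retries` is a hypothetical helper that only prepares the settings dictionary.

```python
def with_retries(task, max_retries=3, min_retry_interval_millis=60_000,
                 retry_on_timeout=False):
    """Return a copy of a task definition with retry settings applied."""
    updated = dict(task)  # leave the caller's dictionary untouched
    updated.update(
        max_retries=max_retries,
        min_retry_interval_millis=min_retry_interval_millis,
        retry_on_timeout=retry_on_timeout,
    )
    return updated


# Placeholder task definition, for illustration only.
task = {
    "task_key": "task_1",
    "notebook_task": {"notebook_path": "/Workspace/example/notebook"},
}
task_with_retries = with_retries(task, max_retries=2)
```

The resulting dictionary would go into the `tasks` list of the job settings sent to the Jobs API.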

  • 4 kudos
dceman
by Databricks Partner
  • 1972 Views
  • 1 replies
  • 0 kudos

How to skip "onboarding" wizard?

I have registered an account via AWS Marketplace. I have also deployed workspaces with Terraform. When I log in to the admin console, it redirects me to https://accounts.cloud.databricks.com/onboarding where I need to create a workspace manually, but I don't want ...

Latest Reply
Mounika_Tarigop
Databricks Employee
  • 0 kudos

Hi Team, would you mind telling us how you provisioned the workspaces? Are you using the same account ID that you used during creation? If so, could you please try logging in through an incognito window and see if that works?

  • 0 kudos
190809
by Contributor
  • 3971 Views
  • 2 replies
  • 1 kudos

Example API call using 'has_more=true'

Can someone please provide an example while loop that handles has_more=true? I can't get pagination to work for the API endpoint '/jobs/runs/list/'. Thanks

Latest Reply
arpit
Databricks Employee
  • 1 kudos

Hi @Rachel Cunningham, could you please elaborate on what you mean by "I can't get pagination to work"? Is "has_more" set to "true" even when there are no more tasks to list? That is, do you mean it doesn't list all runs or doesn't list tasks within each...
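As a starting point for the loop asked about in the question, here is a minimal sketch. It assumes each response is a dictionary with `runs`, `has_more`, and (in newer API versions) `next_page_token`, and falls back to `offset` when no token is returned. The HTTP call is passed in as `fetch_page`, a hypothetical callable that takes the query parameters and returns the parsed JSON, so the loop itself stays independent of the HTTP client.

```python
def list_all_runs(fetch_page, job_id=None, limit=25):
    """Collect runs from every page until has_more is false."""
    runs = []
    params = {"limit": limit}
    if job_id is not None:
        params["job_id"] = job_id
    while True:
        page = fetch_page(params)
        page_runs = page.get("runs", [])
        runs.extend(page_runs)
        if not page.get("has_more"):
            break
        token = page.get("next_page_token")
        if token:
            params["page_token"] = token  # token-based pagination
        elif page_runs:
            # offset-based pagination: skip what has been received so far
            params["offset"] = params.get("offset", 0) + len(page_runs)
        else:
            break  # has_more without a token or any runs: stop rather than loop forever
    return runs
```

With the `requests` library, `fetch_page` could be something like `lambda p: requests.get(f"{host}/api/2.1/jobs/runs/list", headers=headers, params=p).json()`, where `host` and `headers` hold the workspace URL and the bearer token.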

  • 1 kudos
1 More Replies
arun_pamulapati
by Databricks Employee
  • 1880 Views
  • 1 replies
  • 1 kudos

www.youtube.com

We made another major release for Security Analysis Tool (SAT) with Unity Catalog and Delta sharing checks, Terraform deployments, and faster analysis if you have many workspaces. If you are on Azure Databricks there are new step-by-step video-based ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Thank you for sharing @Arun Pamulapati​!!!

  • 1 kudos
saikrishna3390
by New Contributor II
  • 1551 Views
  • 1 replies
  • 0 kudos

The current cluster state is pending. Please retry your request after 30 seconds

We are trying to connect to a database instance from DataHub/DBeaver and are getting this error. We can connect manually after a few tries, but we hit it every time we execute our code to make a connection. We need to resolve this before ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Could you share more details? For example, go to the driver's logs, extract them, and share the error stack trace with us, please.
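While the root cause is being investigated, the error message itself suggests waiting and retrying. A minimal sketch of that workaround in code, assuming the connection attempt raises an exception while the cluster is still pending; `connect_with_retry` and the `connect` callable are hypothetical, and catching every `Exception` is a simplification that should be narrowed to the driver's actual error type.

```python
import time


def connect_with_retry(connect, attempts=5, wait_seconds=30, sleep=time.sleep):
    """Call connect() until it succeeds, pausing between failed attempts."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except Exception as exc:  # simplification: narrow this in real code
            last_error = exc
            if attempt < attempts:
                sleep(wait_seconds)  # the error message asks for 30 seconds
    raise last_error
```

The `sleep` parameter exists only so the wait can be replaced in tests; in normal use the default `time.sleep` applies.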

  • 0 kudos
JeroenD
by New Contributor
  • 1609 Views
  • 1 replies
  • 0 kudos

Waiting list

I would like to do the Platform Administrator learning plan, but for all components in the learning plan it mentions "in waiting list". What does this mean?

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

adding @Vidula Khanna​ and @Kaniz Fatma​ for visibility

  • 0 kudos
asif5494
by New Contributor III
  • 2523 Views
  • 1 replies
  • 3 kudos

Study material for Databricks Certified Data Engineer Professional Certification?

I want to go for the Databricks Certified Data Engineer Professional certification. Is there any predefined study material for it?

Latest Reply
jose_gonzalez
Databricks Employee
  • 3 kudos

adding @Vidula Khanna​ and @Kaniz Fatma​ for visibility

  • 3 kudos
prasadvaze
by Valued Contributor II
  • 20717 Views
  • 1 replies
  • 1 kudos

How to start a local/city Databricks user group?

Hello Lindsey, I would like to start a Richmond, VA Databricks user group (chapter). How do I go about doing this?

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

adding @Vidula Khanna​ and @Kaniz Fatma​ for visibility

  • 1 kudos
Ogi
by New Contributor II
  • 2648 Views
  • 4 replies
  • 1 kudos

Setting right processingTime

How do I set just the right processingTime for readStream to maximize performance? Which factors does it depend on, and is there a way to measure this?

Latest Reply
Ogi
New Contributor II
  • 1 kudos

Thanks @Ajay Pandey and @Nandini N for your answers. I wanted to know more about what I should do in order to do it properly. Should I change processing times (1, 5, 10, 30, 60 seconds) and see how it affects the running job in terms of time and CPU/me...
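One way to make that comparison measurable is to look at the progress reports of the streaming query for each candidate interval. The sketch below assumes a dictionary shaped like PySpark's `query.lastProgress`, with `durationMs.triggerExecution`, `inputRowsPerSecond`, and `processedRowsPerSecond`; the helper `summarize_progress` and its output keys are hypothetical.

```python
def summarize_progress(progress, trigger_interval_ms):
    """Compare one micro-batch's duration with the configured trigger interval."""
    batch_ms = progress["durationMs"]["triggerExecution"]
    return {
        "batch_ms": batch_ms,
        # share of the interval spent working; close to or above 1.0 means no headroom
        "utilization": batch_ms / trigger_interval_ms,
        "falling_behind": batch_ms > trigger_interval_ms,
        "input_rows_per_s": progress.get("inputRowsPerSecond"),
        "processed_rows_per_s": progress.get("processedRowsPerSecond"),
    }
```

The interval itself is set on the write side, for example `df.writeStream.trigger(processingTime="10 seconds")`. If batches regularly take longer than the interval, a shorter interval will not help; if utilization stays very low, the interval can probably be reduced for lower latency.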

  • 1 kudos
3 More Replies
Anonymous
by Not applicable
  • 2106 Views
  • 1 replies
  • 4 kudos

Hello Everyone, I am thrilled to announce that we have our first winner for the raffle contest - @Uma Maheswara Rao Desula​ Please join me in congratu...

Hello Everyone, I am thrilled to announce that we have our first winner for the raffle contest - @Uma Maheswara Rao Desula. Please join me in congratulating him on this remarkable achievement! UmaMahesh, your dedication and hard work have paid off, and...

Latest Reply
Sujitha
Databricks Employee
  • 4 kudos

@Uma Maheswara Rao Desula Congratulations on this well-deserved win! Can't wait for you to meet our Community peers at the Data + AI Summit 2023 in SFO.

  • 4 kudos