Data Engineering

Forum Posts

Raja_682227
by New Contributor II
  • 906 Views
  • 2 replies
  • 2 kudos

Databricks Data Cleanroom

I just need to understand the Data Cleanroom. As per the documentation, Databricks Data Cleanroom provides a secure, governed, and privacy-safe environment. Participants can enable fine-grained access control over data with the help of Unity Catalog. Also...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Rajarampandian Arumugam, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear fro...

1 More Replies
Snowhow1
by New Contributor II
  • 2918 Views
  • 1 replies
  • 1 kudos

Logging when using multiprocessing with joblib

Hi, I'm using joblib for multiprocessing in one of our processes. Logging works well (apart from weird py4j errors, which I suppress), except when it runs within multiprocessing. Also, how do I suppress the other errors that I always receive on DB - perha...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Sam G: It seems like the issue is related to the py4j library used by Spark, and not specifically to joblib or multiprocessing. The error message indicates a network error while sending a command between the Python process and the Java Virt...
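A minimal sketch of the two points raised in this thread, under stated assumptions: it raises the level of the `py4j` logger to hide its messages, and it configures a logger inside the function that each worker runs, since worker processes may not share the handlers set up in the parent. The names `get_worker_logger` and `square` are illustrative only, and the joblib call is shown as a comment because it assumes joblib is installed.

```python
import logging

# Raise the level of the py4j logger so its lower-severity messages are hidden.
logging.getLogger("py4j").setLevel(logging.ERROR)


def get_worker_logger(name: str = "worker") -> logging.Logger:
    """Return a logger with one stream handler, configuring it only once per process."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(processName)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False  # avoid duplicate lines via the root logger
    return logger


def square(x: int) -> int:
    """Example task: the logger is set up inside the function, so it also works in a worker."""
    log = get_worker_logger()
    log.info("processing %s", x)
    return x * x


# Assumed usage with joblib (not executed here):
#   from joblib import Parallel, delayed
#   results = Parallel(n_jobs=2)(delayed(square)(i) for i in range(4))
```

Configuring the logger lazily inside the task keeps the setup independent of how the worker process was started, at the cost of a handler check on every call.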

jhon341
by New Contributor
  • 2504 Views
  • 1 replies
  • 0 kudos

How can I optimize Spark performance in Databricks for large-scale data processing

I'm using Databricks for processing large-scale data with Apache Spark, but I'm experiencing performance issues. Processing is taking longer than expected, and I'm running into memory and CPU limits. I want to optimize the perform...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@jhon marton: Optimizing Spark performance in Databricks for large-scale data processing can involve a combination of techniques, configurations, and best practices. Below are some recommendations that can help improve the performance of your Spark ...

jtorr
by New Contributor
  • 1059 Views
  • 1 replies
  • 0 kudos

What are executeAdhocQuery and executeFastQuery operations in the Azure SQL Logs?

Hi,
- I'm performing some analysis using the Databricks SQL logs, and I'm seeing these operation names.
- I notice these events don't seem to have a duration or query text, unlike commandSubmit operations.
- Any explanation of what these operations mean exactly...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Jose Torres, executeAdhocQuery and executeFastQuery are two types of operations that can appear in the Azure SQL Logs. executeAdhocQuery refers to the execution of an ad hoc query, which is a one-time query that is not stored as a prepared statem...

lugger1
by New Contributor III
  • 1537 Views
  • 1 replies
  • 1 kudos

Resolved! What is the best way to use credentials for API calls from databricks notebook?

Hello, I have a Databricks account on Azure, and the goal is to compare different image tagging services from Azure, GCP, and AWS via the corresponding API calls from a Python notebook. I have problems with GCP Vision API calls, specifically with credentials...

Latest Reply
lugger1
New Contributor III
  • 1 kudos

OK, here is a trick: in my case, the file with GCP credentials is stored in notebook workspace storage, which is not visible through os.environ. So the solution is to read the contents of this file and save it to the cluster storage attached to the no...
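One way the trick described above could look, as a hedged sketch and not the poster's actual code: read the credentials file, write a copy to a directory the cluster can read, and point the `GOOGLE_APPLICATION_CREDENTIALS` environment variable at the copy. The function name `stage_credentials`, the file name `gcp_credentials.json`, and both paths in the usage note are assumptions for illustration.

```python
import os
from pathlib import Path


def stage_credentials(source_path: str, dest_dir: str) -> str:
    """Copy a credentials file into dest_dir and expose it to GCP client libraries."""
    content = Path(source_path).read_text()
    dest = Path(dest_dir) / "gcp_credentials.json"  # hypothetical file name
    dest.write_text(content)
    dest.chmod(0o600)  # restrict the copied key to the current user
    # Google client libraries look up this variable to find the key file.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(dest)
    return str(dest)
```

In a notebook this might be called as `stage_credentials("/Workspace/path/to/key.json", "/tmp")`, where both paths are placeholders. For anything beyond an experiment, a secret scope is usually a safer home for the key than a workspace file.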

testname1
by New Contributor II
  • 996 Views
  • 1 replies
  • 1 kudos

Is it possible to use the databricks-sql-nodejs driver in a create-react-app app?

I'm using the typescript example for the databricks sql driver but I'm getting errors when compiling:

image.png
Latest Reply
User16502773013
New Contributor III
  • 1 kudos

Hello @asdf fdsa, the NodeJS connector is built for the NodeJS environment; it will not integrate with ReactJS. For cases where web execution is needed, we advise using the SQL Exec API. Please check the documentation here: https://docs.databricks.com/sql/a...

Diego_MSFT
by New Contributor II
  • 2444 Views
  • 1 replies
  • 4 kudos

Automating the rerun of a job (with several tasks) // automating the notification of specific failed tasks after retrying // error handling in an Azure Data Factory pipeline with a Databricks notebook

Hi Databricks Experts: I'm using Databricks on Azure. I'd like to understand the following: 1) whether there is a way to automate the rerun of specific failed tasks from a job (with several tasks), for example if I have 4 tasks, and tasks 1 and 2 h...

Latest Reply
Lindberg
New Contributor II
  • 4 kudos

You can use "retries". In Workflows, select your job, then the task, and configure retries in the options below. You can also see more options at: https://learn.microsoft.com/pt-br/azure/databricks/dev-tools/api/2.0/jobs?source=recommendations
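For those who configure jobs through the API and not the UI, the retry options mentioned above correspond to per-task settings. The sketch below builds such a task entry as a plain dictionary; it assumes the `max_retries`, `min_retry_interval_millis`, and `retry_on_timeout` task fields of the Jobs API, and the helper name, task key, and notebook path are made up for illustration.

```python
import json
from typing import Dict


def notebook_task_with_retries(
    task_key: str,
    notebook_path: str,
    max_retries: int = 2,
    min_retry_interval_ms: int = 60_000,
) -> Dict:
    """Build one task entry for a job definition, with retry settings attached."""
    return {
        "task_key": task_key,
        "notebook_task": {"notebook_path": notebook_path},
        "max_retries": max_retries,  # how many times a failed run is retried
        "min_retry_interval_millis": min_retry_interval_ms,  # wait between attempts
        "retry_on_timeout": False,  # do not retry when the task timed out
    }


# Placeholder task key and notebook path.
task = notebook_task_with_retries("task_1", "/Workspace/etl/task_1")
print(json.dumps(task, indent=2))
```

The resulting dictionary would go into the `tasks` list of a job create or update request; retries apply per task, so only the tasks that need them have to carry these fields.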

dceman
by New Contributor
  • 578 Views
  • 1 replies
  • 0 kudos

How to skip "onboarding" wizard?

I registered an account via AWS Marketplace. I also deployed workspaces with Terraform. When I log in to the admin console, it redirects me to https://accounts.cloud.databricks.com/onboarding where I need to create a workspace manually, but I don't want ...

Latest Reply
Mounika_Tarigop
New Contributor II
  • 0 kudos

Hi Team, would you mind telling us how you provisioned the workspaces? Are you using the same account ID that you used during creation? If so, could you please try logging in through an incognito window and see if that works?

190809
by Contributor
  • 924 Views
  • 2 replies
  • 1 kudos

Example API call using 'has_more=true'

Can someone please provide an example while loop including has_more=true. I can't get pagination to work for the API endpoint '/jobs/runs/list/'. Thanks

Latest Reply
arpit
Contributor III
  • 1 kudos

Hi @Rachel Cunningham, could you please elaborate on what you mean by "I can't get pagination to work"? Is "has_more" set to "true" even when there are no more tasks to list? That is, do you mean it doesn't list all runs or doesn't list tasks within each...

1 More Replies
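A minimal sketch of the kind of loop asked for in this thread, assuming the endpoint accepts `limit` and `offset` query parameters and returns `runs` together with `has_more` (newer API versions may paginate with a page token instead, so check the reference for your version). `iter_runs`, `make_http_fetcher`, `host`, and `token` are illustrative names, not part of any Databricks library.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator


def iter_runs(fetch_page: Callable[[Dict], Dict], limit: int = 25) -> Iterator[Dict]:
    """Yield every run by following has_more with an increasing offset."""
    offset = 0
    while True:
        page = fetch_page({"limit": limit, "offset": offset})
        runs = page.get("runs", [])
        yield from runs
        # Stop when no further pages are reported, or when a page comes back
        # empty (guards against looping forever if has_more stays true).
        if not page.get("has_more", False) or not runs:
            break
        offset += len(runs)


def make_http_fetcher(host: str, token: str) -> Callable[[Dict], Dict]:
    """Hypothetical fetcher for GET {host}/api/2.1/jobs/runs/list."""
    def fetch_page(params: Dict) -> Dict:
        query = urllib.parse.urlencode(params)
        url = f"{host}/api/2.1/jobs/runs/list?{query}"
        req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return fetch_page


# Assumed usage (placeholders for the workspace URL and token):
#   all_runs = list(iter_runs(make_http_fetcher("https://<workspace-url>", "<token>")))
```

Passing the page fetcher in as a function keeps the loop itself free of network code, so it can be exercised with a fake fetcher before pointing it at a real workspace.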
arun_pamulapati
by New Contributor III
  • 458 Views
  • 1 replies
  • 1 kudos

www.youtube.com

We made another major release of the Security Analysis Tool (SAT) with Unity Catalog and Delta Sharing checks, Terraform deployments, and faster analysis if you have many workspaces. If you are on Azure Databricks, there are new step-by-step video-based ...

Latest Reply
jose_gonzalez
Moderator
  • 1 kudos

Thank you for sharing @Arun Pamulapati​!!!

saikrishna3390
by New Contributor II
  • 491 Views
  • 1 replies
  • 0 kudos

The current cluster state is pending . please retry your request after 30 seconds

We are trying to connect to a database instance from DataHub/DBeaver and are getting an error. We can connect manually after a few tries, but we hit the error every time our code tries to make a connection. We need to resolve this before ...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Could you share more details? For example, please go to the driver's logs, extract them, and share the error stack trace with us.

JeroenD
by New Contributor
  • 549 Views
  • 1 replies
  • 0 kudos

Waiting list

I would like to do the Platform Administrator learning plan, but for all components in the learning plan it mentions "in waiting list". What does this mean?

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Adding @Vidula Khanna and @Kaniz Fatma for visibility.

asif5494
by New Contributor III
  • 617 Views
  • 1 replies
  • 3 kudos

Study material for Databricks Certified Data Engineer Professional Certification?

I want to go for the Databricks Certified Data Engineer Professional certification. Is there any predefined study material for it?

Latest Reply
jose_gonzalez
Moderator
  • 3 kudos

Adding @Vidula Khanna and @Kaniz Fatma for visibility.

Ogi
by New Contributor II
  • 686 Views
  • 4 replies
  • 1 kudos

Setting right processingTime

How do I set just the right processingTime for readStream to maximize performance? Which factors does it depend on, and is there a way to measure this?

Latest Reply
Ogi
New Contributor II
  • 1 kudos

Thanks @Ajay Pandey and @Nandini N for your answers. I wanted to know more about what I should do in order to do it properly. Should I change processing times (1, 5, 10, 30, 60 seconds) and see how it affects the running job in terms of time and CPU/me...

3 More Replies
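One possible way to measure the effect of different intervals, sketched under assumptions: in Structured Streaming the processing-time trigger is set on the writer (`writeStream.trigger(processingTime=...)`), and each entry of `query.recentProgress` reports how long a micro-batch took in `durationMs.triggerExecution`. The helper below compares those durations with the chosen interval; the function name and the returned keys are illustrative only.

```python
from statistics import mean
from typing import Dict, List


def summarize_progress(progress: List[Dict], trigger_ms: int) -> Dict[str, float]:
    """Summarize micro-batch timings against the trigger interval.

    `progress` is assumed to have the shape of `query.recentProgress`, where
    each entry carries durationMs.triggerExecution (milliseconds per batch).
    Raises an error if `progress` is empty.
    """
    durations = [p["durationMs"]["triggerExecution"] for p in progress]
    overruns = sum(1 for d in durations if d > trigger_ms)
    return {
        "avg_batch_ms": mean(durations),
        "max_batch_ms": max(durations),
        # Share of batches that took longer than the trigger interval.
        "overrun_fraction": overruns / len(durations),
    }


# Assumed notebook usage (df and the sink options are placeholders):
#   query = df.writeStream.trigger(processingTime="10 seconds").start(...)
#   print(summarize_progress(query.recentProgress, trigger_ms=10_000))
```

If most batches overrun the interval, the trigger is shorter than the job can sustain; if batches finish far inside it, a shorter interval may reduce latency. Repeating the measurement for each candidate interval gives a basis for comparison.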
AdamRink
by New Contributor III
  • 1008 Views
  • 3 replies
  • 0 kudos

Resolved! Apply Avro defaults when writing to Confluent Kafka

I have an Avro schema for my Kafka topic, and the schema has defaults. I would like to exclude the defaulted columns from Databricks and just let them default to an empty array. Sample Avro, where I'm trying not to provide the UserFields because I can't...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Adam Rink, please go through the following documentation page and let me know if it helps: https://docs.databricks.com/spark/latest/structured-streaming/avro-dataframe.html#example-with-schema-registry

2 More Replies