Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

naveenreddy1
by New Contributor II
  • 18325 Views
  • 4 replies
  • 0 kudos

Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages. Driver stacktrace

We are using a 3-node Databricks cluster with 32 GB of memory. It works fine most of the time, but it sometimes throws the error: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues.

Latest Reply
RodrigoDe_Freit
New Contributor II
  • 0 kudos

If your job fails, follow this. According to https://docs.databricks.com/jobs.html#jar-job-tips: "Job output, such as log output emitted to stdout, is subject to a 20MB size limit. If the total output has a larger size, the run will be canceled and ma...
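One way to stay under that 20 MB stdout limit is to route verbose logging to a file rather than printing it. A minimal sketch, assuming a Python job and a hypothetical DBFS log path:

```python
import logging

# Send log output to a file instead of stdout so it does not count
# toward the 20 MB stdout cap; the path below is a hypothetical example.
logging.basicConfig(
    filename="/dbfs/tmp/my_job.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

logging.info("Processed batch %d", 42)  # written to the file, not stdout
```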

3 More Replies
vamsivarun007
by New Contributor II
  • 37207 Views
  • 5 replies
  • 2 kudos

Driver is up but is not responsive, likely due to GC.

Hi all, "Driver is up but is not responsive, likely due to GC." This is the message in the cluster event logs. Can anyone help me with this? What does GC mean? Garbage collection? Can we control it externally?

Latest Reply
jacovangelder
Honored Contributor
  • 2 kudos

Nine times out of ten, GC trouble is due to out-of-memory exceptions. @Jaron, spark.catalog.clearCache() is not a configurable option, but rather a command to submit.
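To make that distinction concrete: clearing the cache is a statement you run, not a setting you put in the cluster's Spark config. A minimal PySpark sketch, assuming a notebook where `spark` is predefined:

```python
# Run as a command in a notebook or job, not as a Spark config entry.
spark.catalog.clearCache()  # evicts all cached tables/DataFrames from memory
```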

4 More Replies
brickster_2018
by Databricks Employee
  • 2933 Views
  • 2 replies
  • 0 kudos

Resolved! The driver is temporarily unavailable

My job fails with "The driver is temporarily unavailable". Apparently it's permanently unavailable, because the job is not pausing but failing.

Latest Reply
Chalki
New Contributor III
  • 0 kudos

I am facing the same issue. I am writing in batches using a simple for loop, with no collect statements inside the loop. I am rewriting partitions with dynamic partition overwrite mode in a huge, wide Delta table of several TB. The incr...
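For reference, the dynamic partition overwrite pattern described here looks roughly like the sketch below; the table name is hypothetical, and the target table is assumed to already exist and be partitioned:

```python
# With "dynamic" mode, an overwrite replaces only the partitions present
# in the incoming DataFrame instead of truncating the whole table.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

(df.write
   .mode("overwrite")
   .insertInto("analytics.events"))  # hypothetical pre-existing partitioned table
```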

1 More Replies
MarsSu
by New Contributor II
  • 3180 Views
  • 3 replies
  • 3 kudos

Resolved! Does the driver node of job compute have HA?

I would like to confirm and discuss the HA mechanism for the driver node of job compute, since we can picture the driver node like the master node of a cluster. In AWS EMR, we can set up 2 master nodes so that if one master node fails, another master node can re...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Mars Su​, we haven't heard from you since the last response from @Werner Stinckens​ and @karthik p​, and I was checking back to see if their suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be...

2 More Replies
testname1
by New Contributor II
  • 2139 Views
  • 1 reply
  • 1 kudos

Is it possible to use the databricks-sql-nodejs driver in a create-react-app app?

I'm using the TypeScript example for the Databricks SQL driver, but I'm getting errors when compiling:

Latest Reply
User16502773013
Databricks Employee
  • 1 kudos

Hello @asdf fdsa​, the Node.js connector is built for a Node.js environment; it will not integrate with ReactJS. For cases where web execution is needed, we advise using the SQL Exec API. Please check the documentation here: https://docs.databricks.com/sql/a...
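As one hedged illustration of that approach, a web backend could call the SQL Statement Execution API over plain HTTP; the host, token, and warehouse ID below are placeholders:

```python
import requests

# Submit a statement to a SQL warehouse and wait up to 30s for the result.
resp = requests.post(
    "https://<workspace-host>/api/2.0/sql/statements/",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={
        "warehouse_id": "<warehouse-id>",
        "statement": "SELECT 1",
        "wait_timeout": "30s",
    },
)
print(resp.json())  # statement status plus, once finished, the result data
```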

Anonymous
by Not applicable
  • 10020 Views
  • 3 replies
  • 14 kudos

Resolved! No suitable driver error when configuring the Databricks ODBC and JDBC drivers

Hi all, I've just encountered this issue. I launched a MySQL database in RDS on AWS, then used this simple code to create a connection to it, but it fails with this error. Is there any additional step, or could anyone take a look at i...

Latest Reply
Jag
New Contributor III
  • 14 kudos

Hello, it looks like an issue with the JDBC URL. When I was trying to access an Azure SQL database, I faced the same issue, so I created the JDBC URL as below and it went well: jdbc:sqlserver://<serverurl>:1433;database=<databasename>;user=<username>@<serve...
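A sketch of how such a URL might be used from PySpark; the table, credentials, and server values are placeholders:

```python
jdbc_url = "jdbc:sqlserver://<serverurl>:1433;database=<databasename>"

df = (spark.read.format("jdbc")
      .option("url", jdbc_url)
      .option("dbtable", "dbo.my_table")  # hypothetical table
      .option("user", "<username>")
      .option("password", "<password>")
      .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
      .load())
```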

2 More Replies
jonathan-dufaul
by Valued Contributor
  • 1790 Views
  • 3 replies
  • 3 kudos

Resolved! Why does chaining spark.read from one system/driver and .write to another system/driver take so much longer than doing each piece individually?

I am reading data from IBM DB2 and saving it into a MS SQL Server (the first step is moving the code itself to Databricks; then we will move the databases to Databricks itself). The problem I'm running into is that doing something like the below will take > ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

Hi, it is related to partitioning optimization. By default, the JDBC driver queries the source database with only a single thread, so only one partition was created and the write ran from that single partition on a single core. When you used pandas, it d...
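The usual fix is to give the JDBC reader partitioning hints so it opens several connections in parallel. A minimal sketch, assuming a numeric, roughly evenly distributed id column (all names and bounds hypothetical):

```python
df = (spark.read.format("jdbc")
      .option("url", "jdbc:db2://<host>:50000/<db>")  # placeholder DB2 URL
      .option("dbtable", "schema.src_table")          # hypothetical table
      .option("user", "<user>")
      .option("password", "<password>")
      .option("partitionColumn", "id")   # hypothetical numeric column
      .option("lowerBound", "1")
      .option("upperBound", "10000000")
      .option("numPartitions", "8")      # 8 parallel JDBC connections
      .load())
```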

2 More Replies
Ossian
by New Contributor
  • 1986 Views
  • 1 reply
  • 0 kudos

Driver restarts and job dies after 10-20 hours (Structured Streaming)

I am running a Java/JAR Structured Streaming job on a single-node cluster (Databricks Runtime 8.3). The job contains a single query which reads records from multiple Azure Event Hubs using Spark's Kafka functionality and outputs results to a MSSQL dat...
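For context, reading Event Hubs through Spark's Kafka source typically looks like the sketch below; namespace, hub name, and connection string are placeholders, and the Kafka endpoint uses SASL_SSL with the connection string as the password:

```python
conn = "Endpoint=sb://<namespace>.servicebus.windows.net/;..."  # placeholder

stream = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers",
                  "<namespace>.servicebus.windows.net:9093")
          .option("subscribe", "<event-hub-name>")
          .option("kafka.security.protocol", "SASL_SSL")
          .option("kafka.sasl.mechanism", "PLAIN")
          .option("kafka.sasl.jaas.config",
                  'org.apache.kafka.common.security.plain.PlainLoginModule required '
                  f'username="$ConnectionString" password="{conn}";')
          .load())
```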

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 0 kudos

It seems that when your nodes scale up, the cluster looks for the init script and fails, so you can use reserved instances for this activity instead of spot instances, though it will increase your overall cost. Alternatively, you can use dependent librar...

sriramkumar
by New Contributor II
  • 1357 Views
  • 2 replies
  • 1 kudos

Reasons for new Databricks driver

What are the reasons behind Databricks going for their own driver? What differences are made when switching between the previous Spark driver and the new Databricks driver? Is there any specific document I can look at, or just the release notes? Also, w...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hey @Sriramkumar Thamizharasan​, hope all is well! Just wanted to check in: if you were able to resolve your issue, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from...

1 More Replies
GeorgeP
by New Contributor II
  • 1764 Views
  • 2 replies
  • 2 kudos

Errors when querying Azure DataBricks through DBeaver on macos

I configured DBeaver to work with either the latest Databricks driver or the Simba driver. I can connect and see databases, schemas, tables, and columns. However, when a SELECT statement is executed, 30-40 seconds go by before I get the following error message: SQL...

Latest Reply
sage5616
Valued Contributor
  • 2 kudos

Has this issue been resolved? @aravhish's solution did not help me. Any other options? I am experiencing the exact same issue with the same configuration on a Mac. Any help would be appreciated.

1 More Replies
abd
by Contributor
  • 6667 Views
  • 7 replies
  • 16 kudos

Resolved! What will happen if a driver or worker node fails?

What will happen if the driver node fails? What will happen if one of the worker nodes fails? Is it the same in Spark and Databricks, or does Databricks provide additional features to handle these situations?

Latest Reply
Cedric
Databricks Employee
  • 16 kudos

If the driver node fails, your cluster will fail. If a worker node fails, Databricks will spawn a new worker node to replace it and resume the workload. Generally it is recommended to assign an on-demand instance for your driver and spo...
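That recommendation can be expressed in a Clusters API payload; a hedged sketch on AWS, with placeholder names and versions:

```python
# first_on_demand=1 keeps the first node (the driver) on an on-demand
# instance, while the remaining workers use spot with on-demand fallback.
cluster_spec = {
    "cluster_name": "etl-cluster",         # hypothetical
    "spark_version": "13.3.x-scala2.12",   # placeholder runtime
    "node_type_id": "i3.xlarge",           # placeholder instance type
    "num_workers": 4,
    "aws_attributes": {
        "first_on_demand": 1,
        "availability": "SPOT_WITH_FALLBACK",
    },
}
```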

6 More Replies
knight007
by New Contributor II
  • 3992 Views
  • 7 replies
  • 5 kudos

Containerized Databricks/Spark database

Hello. I'm fairly new to Databricks and Spark. I have a requirement to connect to Databricks using JDBC, and that works perfectly using the driver I downloaded from the Databricks website ("com.simba.spark.jdbc.Driver"). What I would like to do now is ha...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

@Gurps Bassi​, a "running instance of a database in docker" would be the Hive metastore, which just maps to data that is usually physically on the data lake. Databricks is so much on the cloud that setting up the metastore locally doesn't make sense. Inste...

6 More Replies
sh_abrishami_ie
by New Contributor II
  • 4684 Views
  • 1 reply
  • 3 kudos

Resolved! Driver is up but is not responsive, likely due to GC.

Hi, I have a problem writing an Excel file to a mounted location. After 10 minutes I see "Driver is up but is not responsive, likely due to GC" in the log events. I'm using the following script: df.repartition(1).write .format("com.crealytics.spark....

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

It is not a solution to that problem, but I recommend handling Excel reads and writes with Koalas: https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_excel.html. Just give it a try; maybe it will solve your issue.
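A minimal sketch of that suggestion, assuming the koalas and openpyxl packages are installed and using a hypothetical mounted output path:

```python
import databricks.koalas as ks  # importing koalas enables .to_koalas() on Spark DataFrames

kdf = df.to_koalas()  # convert the existing Spark DataFrame
kdf.to_excel("/dbfs/mnt/output/report.xlsx")  # hypothetical mount point
```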

User16826992666
by Valued Contributor
  • 7881 Views
  • 1 reply
  • 1 kudos

Resolved! When should I choose a different driver type on my cluster vs the worker type?

When creating a cluster, the driver type defaults to the same type as the workers, and this is what I usually choose. But in what kind of situation would I want to choose a different driver type?

Latest Reply
sean_owen
Databricks Employee
  • 1 kudos

Using the same instance type is a fine default. If you know that you need very large workers, but little happens on the driver, maybe you can save money with a smaller driver. Conversely, you may know that some parts of your notebook involve a lot of...
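As an illustration of driver-heavy work: anything that materializes results on the driver, such as collect() or toPandas(), benefits from a larger driver even when the workers are small. A brief sketch:

```python
pdf = df.toPandas()   # entire result pulled into driver memory
rows = df.take(10)    # a small take() is fine on a modest driver
```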
