I am facing the same issue. I am writing in batches using a simple for loop, with no collect statements inside the loop. I am rewriting partitions using dynamic partition overwrite mode in a huge, wide Delta table (several TB). The incr...
I would like to confirm and discuss the HA mechanism for the driver node of job compute, because we can think of the driver node as the master node of the cluster. In AWS EMR, we can set up 2 master nodes so that if one master node fails, another master node can re...
Hi @Mars Su, we haven't heard from you since the last responses from @Werner Stinckens and @karthik p, and I was checking back to see if their suggestions helped you. Otherwise, if you have found a solution, please share it with the community, as it can be...
Hello @asdf fdsa, the NodeJS connector is built for the NodeJS environment; it will not integrate with ReactJS. For cases where web execution is needed, we advise using the SQL Exec API. Please check the documentation here: https://docs.databricks.com/sql/a...
Hi all, I've just encountered this issue. I launched a MySQL database in AWS RDS and then used this simple code to create a connection to it, but it fails with this error. Is there an additional step? Or could anyone take a look at i...
Hello, it looks like an issue with the JDBC URL. I was facing the same issue when trying to access an Azure SQL database. I created the JDBC URL as below and it worked: jdbc:sqlserver://<serverurl>:1433;database=<databasename>;user=<username>@<serve...
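For anyone assembling a similar URL by hand, here is a minimal Python sketch. It covers only the server, port, and database parts visible in the reply above. The helper name `build_sqlserver_jdbc_url` and the example server and database names are hypothetical, and passing credentials separately (for example as the `user` and `password` JDBC options) is an assumption, not the poster's exact setup.

```python
def build_sqlserver_jdbc_url(server, database, port=1433):
    """Assemble a SQL Server JDBC URL from its parts (hypothetical helper)."""
    if not server or not database:
        raise ValueError("server and database are required")
    # Properties after the host:port are separated by semicolons.
    return f"jdbc:sqlserver://{server}:{port};database={database}"


# Example values are placeholders, not real resources.
url = build_sqlserver_jdbc_url("myserver.database.windows.net", "mydb")
print(url)
```

Building the URL in one place makes it easier to spot a missing port or a misplaced semicolon, which are common causes of connection errors.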
I am reading data from IBM DB2 and saving it into MS SQL Server (the first step is moving the code itself to Databricks, and then we will move the databases to Databricks itself). The problem I'm running into is that doing something like the below will take > ...
Hi, this is related to partitioning. By default, a JDBC read queries the source database with only a single thread. Only one partition was created, so the write also ran from that single partition and used a single core. When you used pandas, it d...
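To illustrate the partitioning point above, here is a small sketch of the Spark JDBC options that split a read into several parallel range queries: `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions`. The helper name `jdbc_partition_options`, the column name `id`, and the bounds are assumptions for illustration. The right column and bounds depend on your table.

```python
def jdbc_partition_options(column, lower, upper, num_partitions):
    """Build the Spark JDBC options for a partitioned read (hypothetical helper)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    # Spark reads option values as strings, so convert the numbers here.
    return {
        "partitionColumn": column,   # numeric, date, or timestamp column
        "lowerBound": str(lower),    # used to compute the partition stride
        "upperBound": str(upper),    # the bounds do not filter rows
        "numPartitions": str(num_partitions),
    }


opts = jdbc_partition_options("id", 1, 1_000_000, 8)
print(opts)

# On a cluster with an active SparkSession named `spark`
# (jdbc_url and the table name are placeholders):
# df = (spark.read.format("jdbc")
#       .option("url", jdbc_url)
#       .option("dbtable", "schema.table")
#       .options(**opts)
#       .load())
```

With these options the read produces multiple partitions, so the subsequent write can use more than one core. Without them the whole table arrives in a single partition.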
I am running a Java (JAR) Structured Streaming job on a single-node cluster (Databricks Runtime 8.3). The job contains a single query that reads records from multiple Azure Event Hubs using Spark's Kafka functionality and outputs results to an MSSQL dat...
It seems that when your cluster scales up, the new nodes look for the init script and fail. You can use reserved instances instead of spot instances for this activity, although that will increase your overall cost. Alternatively, you can use dependent librar...
What are the reasons behind Databricks going for their own driver? What changes when switching from the previous Spark driver to the new Databricks driver? Is there any specific document I can look at, or just the release notes? Also, w...
Hey @Sriramkumar Thamizharasan, hope all is well! Just wanted to check in: if you were able to resolve your issue, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from...
I configured DBeaver to work with either the latest Databricks driver or the Simba driver. I can connect and see databases, schemas, tables, and columns. However, when I execute a SELECT statement, 30-40 seconds go by before I get the following error message: SQL...
Has this issue been resolved? @aravhish's solution did not help me. Are there any other options? I am experiencing the exact same issue with the same configuration on a Mac. Any help would be much appreciated.
What happens if the driver node fails? What happens if one of the worker nodes fails? Is it the same in Spark and Databricks, or does Databricks provide additional features to handle these situations?
Hi @Abdullah Durrani, I'm glad to see that the suggestions provided here helped you. Well, in that case, would you please help us select the best answer for the community?
Hello. I'm fairly new to Databricks and Spark. I have a requirement to connect to Databricks using JDBC, and that works perfectly using the driver I downloaded from the Databricks website ("com.simba.spark.jdbc.Driver"). What I would like to do now is ha...
Hi @Gurps Bassi, just a friendly follow-up. Do you still need help, or did the responses from @Hubert Dudek (Customer) and @Werner Stinckens help you find the solution? Please let us know.
Hi, I have a problem writing an Excel file to a mounted path. After 10 minutes I see "Driver is up but is not responsive, likely due to GC" in the log events. I'm using the following script: df.repartition(1).write .format("com.crealytics.spark....
When creating a cluster, the driver type defaults to the same type as the workers, and this is what I usually choose. But in what kind of situation would I want to choose a different driver type?
Using the same instance type is a fine default. If you know that you need very large workers, but little happens on the driver, maybe you can save money with a smaller driver. Conversely, you may know that some parts of your notebook involve a lot of...
We can now launch pools on Databricks with different instance types. Hybrid Pools allow customers to create clusters and select different Databricks pools for the driver and the workers. This provides a way to support driver vs. worker heterogeneity, and ther...
Hi all,
"Driver is up but is not responsive, likely due to GC."
This is the message in the cluster event logs. Can anyone help me with this? What does GC mean? Garbage collection? Can we control it externally?