11-30-2022 04:39 PM
Hi everyone,
I've been stuck for the past two days on an issue with my Databricks JDBC driver, and I'm hoping someone can give me more insight into how to troubleshoot it. I am using the Databricks JDBC driver in RStudio, and the connection was working fine until two days ago, when a company-wide Windows update went out. Now cloud fetch is failing with the following error message:
Error in .jcall(rp, "I", "fetch", stride, block) :
java.sql.SQLException: [Simba][SparkJDBCDriver](500638) The file <blob URL> has not been downloaded successfully and the driver will not retry due to exceeding of the max retry limit 10, you can increase the max retry limit by setting MaxConsecutiveResultFileDownloadRetries.
The driver is still working with small datasets (<1 MB), however. Has anyone encountered this issue before, and how would you fix it? Thank you in advance for your help.
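For reference, the connection is opened roughly like this (host, HTTP path, and token redacted; appending MaxConsecutiveResultFileDownloadRetries to the URL is my reading of the error message, not something I've confirmed fixes it):

```r
library(RJDBC)

# Load the Simba-branded Databricks JDBC driver; the jar path and class
# name below are from my local setup and may differ on your machine.
drv <- JDBC(
  driverClass = "com.simba.spark.jdbc.Driver",
  classPath   = "C:/drivers/SparkJDBC42.jar"
)

# The error suggests raising MaxConsecutiveResultFileDownloadRetries;
# I assume it can be appended like any other driver property.
url <- paste0(
  "jdbc:spark://<workspace-host>:443/default;",
  "transportMode=http;ssl=1;AuthMech=3;",
  "httpPath=<http-path>;",
  "MaxConsecutiveResultFileDownloadRetries=50"
)

conn <- dbConnect(drv, url, "token", "<personal-access-token>")
```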
Kind Regards,
Debbie
12-01-2022 08:04 PM
@Debbie Ng can you try to download 2.6.32 and test please
12-01-2022 08:10 PM
@karthik p Where is v2.6.32 available? The Databricks JDBC Driver page only has v2.6.29 as the latest driver. Thank you for your help!
12-01-2022 08:27 PM
@Debbie Ng Please check this Maven repo: Maven Repository: com.databricks » databricks-jdbc (mvnrepository.com)
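Once the jar is downloaded from that repo, pointing RJDBC at it should look roughly like this (the com.databricks driver class name is my understanding of the newer, non-Simba builds; adjust the path to wherever you saved the jar):

```r
library(RJDBC)

# Load the 2.6.32 jar downloaded from Maven; the newer builds register
# com.databricks.client.jdbc.Driver (the older com.simba.spark class
# name should still be accepted as an alias, as I understand it).
drv <- JDBC(
  driverClass = "com.databricks.client.jdbc.Driver",
  classPath   = "C:/drivers/databricks-jdbc-2.6.32.jar"
)
```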
12-02-2022 04:08 AM
Hi @Debbie Ng, we haven't heard from you since the last response from @karthik p, and I was checking back to see if their suggestions helped you.
Otherwise, if you have found a solution, please share it with the community, as it can be helpful to others.
Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.
12-02-2022 06:45 AM
@karthik p No, unfortunately the same issue still occurs with the updated version.
10-10-2023 04:27 AM
Hi @dng, the error you're encountering suggests that the poppler-utils package is not being installed correctly at the system level when your cluster initializes.
Here are the steps you should follow:
1. Create a bash script that installs poppler-utils. The script should include the following command (a minimal example is sketched after this list): `sudo apt-get -f -y install poppler-utils`.
2. Save this script to DBFS or another accessible location.
3. When you create your cluster or edit an existing cluster, expand the **Advanced Options** section and click the **Init Scripts** tab.
4. Specify the location of your bash script in the **Init Scripts** field.
5. Confirm the changes and start the cluster. The cluster will run the init script on startup, installing the poppler-utils package at the system level before the Spark driver or worker JVMs start.

If you're still encountering issues after following these steps, it may be worth checking whether poppler-utils requires any specific environment variables to be set; if it does, you can set those in the same init script.
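A minimal version of such a script might look like this (the DBFS path in the comment is just an example; use whatever location your workspace keeps init scripts in):

```bash
#!/bin/bash
# Example cluster init script: install poppler-utils at the system level
# before the Spark driver/worker JVMs come up. Save it to e.g.
# dbfs:/databricks/init-scripts/install-poppler.sh and reference that
# path on the cluster's Init Scripts tab.
set -e
sudo apt-get update
sudo apt-get -f -y install poppler-utils
```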
01-30-2023 09:09 AM
@Debbie Ng From your message I see that a Windows update went out and this failure started. Based on the conversation, you have tried the latest version of the driver and still face the problem. I believe this is related to Java version compatibility with the latest update.
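One quick way to check which JVM RJDBC is actually running under from RStudio (a sketch using rJava, which RJDBC is built on; run it in a fresh session):

```r
library(rJava)
.jinit()  # start (or attach to) the JVM that RJDBC will use

# Print the Java version and home directory the driver is seeing;
# if the Windows update switched the default JVM, it shows up here.
.jcall("java/lang/System", "S", "getProperty", "java.version")
.jcall("java/lang/System", "S", "getProperty", "java.home")
```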
08-30-2023 02:54 AM
I hit this issue today when I was given a new Oracle Linux VM.
I have an existing VM which works 100% but the new VM does not.
Driver version is the same on both VMs, and the Java version is the same.
Python code that is running the query is the same.
openjdk version "1.8.0_382"
OpenJDK Runtime Environment (build 1.8.0_382-b05)
OpenJDK 64-Bit Server VM (build 25.382-b05, mixed mode)
Driver 2.6.27
Also tried 2.6.25, 2.6.29, 2.6.32, and 2.6.33 with the same result.
I need help with this ASAP, please, to meet an urgent business need.
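For what it's worth, one workaround I'm going to test is disabling cloud fetch so results stream back inline instead of via blob storage. The EnableQueryResultDownload property name is my reading of the Simba Spark driver docs, and large results will be slower with it off:

```
jdbc:spark://<workspace-host>:443/default;transportMode=http;ssl=1;AuthMech=3;httpPath=<http-path>;EnableQueryResultDownload=0
```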