Re: How to register a JDBC Spark dialect in Python...

User16765131552 · ‎06-18-2021

I am trying to read from a Databricks table, using the JDBC URL of a Databricks cluster. I am getting this error:

 java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.

After these statements:

jdbcConnUrl = "jdbc:spark://adb....."
testquery = "(select * from db.table limit 3)"
testdf = (spark.read.format("jdbc")
    .option("url", jdbcConnUrl)
    .option("dbtable", testquery)
    .option("fetchsize", "10000")
    .load())
testdf.show()

All the solutions I have found for this issue are in Scala, but I am using Python. I want a Python equivalent of this code:

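import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

JdbcDialects.registerDialect(new JdbcDialect() {
  // Apply this dialect to every JDBC URL that starts with "jdbc:spark:".
  override def canHandle(url: String): Boolean = url.toLowerCase.startsWith("jdbc:spark:")
  // Return column names unchanged instead of wrapping them in quotes.
  override def quoteIdentifier(column: String): String = column
})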

imstwz1 · ‎12-22-2022

Hi @Brad Powell, were you able to solve this issue in Python? I'm also stuck on it: the solution is available only in Scala, and I need it in Python. Could you please help me solve this? Thanks.

Meghala · ‎12-26-2022

It was helpful, thank you.

Yadu · ‎12-30-2022

Hi @Brad Powell / @Kaniz Fatma / @S Meghala - Any update on this?

KKDataEngineer · ‎06-28-2023

Is there a solution for this?

KKDataEngineer · ‎06-30-2023

@Retired_mod

I was able to solve this:

Add the Scala code from the original post to a method of a simple Scala object.
Package it into a JAR file.
Install this JAR file on the cluster where you execute the JDBC code.
In your PySpark code, add the line below before the JDBC code. It calls that Scala method directly in the JVM, so the dialect is registered before the read runs.
spark.sparkContext._jvm.<scalaclass fully qualified>.<method>()
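The last step above can be sketched in Python roughly as follows. This is only a minimal sketch: the helper name `call_jvm_static` and the class and method names in the commented usage line are hypothetical, and it assumes the JAR is already installed on the cluster and that the Scala method takes no arguments.

```python
from functools import reduce


def call_jvm_static(spark, qualified_class, method):
    """Invoke a no-argument static JVM method through the Py4J gateway of `spark`."""
    # Resolve the dotted path one segment at a time, the same way
    # spark.sparkContext._jvm.com.example.Foo is resolved by attribute access.
    target = reduce(getattr, qualified_class.split("."), spark.sparkContext._jvm)
    # The trailing () matters: without it the member is looked up but never invoked.
    return getattr(target, method)()


# Hypothetical names: substitute the object and method packaged in your own JAR.
# call_jvm_static(spark, "com.example.dialects.SparkDialectRegistrar", "register")
```

Run this once per Spark session, before the `spark.read.format("jdbc")` call, since the dialect has to be registered in the JVM before the JDBC read starts.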

@User16765131552 @Yadu @imstwz1
