How to register a JDBC Spark dialect in Python?
06-18-2021 01:13 PM
I am trying to read from a Databricks table, using the JDBC URL of a Databricks cluster. I am getting this error:
java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.
The error occurs after these statements:
jdbcConnUrl = "jdbc:spark://adb....."
testquery = "(select * from db.table limit 3)"
testdf = spark.read.format("jdbc").option("url", jdbcConnUrl).option("dbtable", testquery).option("fetchsize", "10000").load()
testdf.show()
All the solutions I have come across for this issue are in Scala, but I am using Python. I want a Python equivalent of this code:
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
JdbcDialects.registerDialect(new JdbcDialect() {
  override def canHandle(url: String): Boolean = url.toLowerCase.startsWith("jdbc:spark:")
  override def quoteIdentifier(column: String): String = column
})
Labels: Databricks table, Python
12-22-2022 10:47 PM
Hi @Brad Powell, were you able to solve this issue in Python? I'm also stuck on it: the solution is available only in Scala, and I need it in Python. Could you please help me solve this issue? Thanks.
12-26-2022 06:30 AM
It was helpful, thank you.
12-30-2022 10:13 AM
Hi @Brad Powell / @Kaniz Fatma / @S Meghala - Any update on this?
06-28-2023 03:20 AM
Is there a solution for this?
06-30-2023 08:37 AM - edited 06-30-2023 11:45 AM
I was able to solve this:
- Put the Scala code from the question into a method of a simple Scala object.
- Package it into a JAR file.
- Install this JAR file on the cluster where you execute the JDBC code.
- Add the line below to your PySpark code before executing the JDBC code. It runs that method of the Scala class directly in your JVM:
spark.sparkContext._jvm.<scalaclass fully qualified>.<method>()
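The Python side of these steps could look roughly like the sketch below. It assumes the JAR exposes a Scala object with a no-argument method that performs the JdbcDialects.registerDialect call; the names com.example.SparkDialectRegistrar and register are hypothetical placeholders, not anything from this thread, so substitute your own. Note that _jvm is a private PySpark attribute and may behave differently across versions.

```python
from functools import reduce


def resolve_jvm_member(jvm, qualified_name):
    """Follow a dotted name (package.Class) attribute by attribute from a JVM view."""
    return reduce(getattr, qualified_name.split("."), jvm)


def register_dialect(spark,
                     class_name="com.example.SparkDialectRegistrar",  # hypothetical name
                     method_name="register"):                          # hypothetical name
    """Call a no-argument method of a JVM class through the py4j gateway."""
    # spark.sparkContext._jvm is the py4j view of the driver JVM (private API).
    helper = resolve_jvm_member(spark.sparkContext._jvm, class_name)
    # Look the method up by name and invoke it. The trailing parentheses matter:
    # referencing the method without calling it does not run anything.
    getattr(helper, method_name)()
```

Calling register_dialect(spark) once, before spark.read.format("jdbc")...load(), should be enough for the session, since the dialect is registered in the driver JVM. If the call fails with a complaint that a JavaPackage object is not callable, that usually suggests the class was not found on the classpath, so check that the JAR is installed on the cluster and that the fully qualified name is correct.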