01-08-2024 10:49 PM
When trying to connect to a SQL warehouse using the JDBC connector with Spark, the error below is thrown. Note that connecting directly to a cluster with similar connection parameters works without issue; the error only occurs with SQL warehouses.
py4j.protocol.Py4JJavaError: An error occurred while calling o327.run. : java.sql.SQLException: [Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: null, Query: SELECT * FROM `default`.`user_cv_table` WHERE 1=0, Error message from Server: Configuration dbtable is not available..
at com.databricks.client.hivecommon.api.HS2Client.buildExceptionFromTStatusSqlState(Unknown Source)
at com.databricks.client.hivecommon.api.HS2Client.buildExceptionFromTStatus(Unknown Source)
at com.databricks.client.hivecommon.api.HS2Client.checkResponseStatus(Unknown Source)
at com.databricks.client.hivecommon.api.HS2Client.executeStatementInternal(Unknown Source)
at com.databricks.client.hivecommon.api.HS2Client.executeStatement(Unknown Source)
at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeNonRowCountQueryHelper(Unknown Source)
at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeQuery(Unknown Source)
at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.<init>(Unknown Source)
at com.databricks.client.hivecommon.dataengine.HiveJDBCDataEngine.prepare(Unknown Source)
at com.databricks.client.jdbc.common.SPreparedStatement.<init>(Unknown Source)
at com.databricks.client.jdbc.jdbc41.S41PreparedStatement.<init>(Unknown Source)
at com.databricks.client.jdbc.jdbc42.S42PreparedStatement.<init>(Unknown Source)
at com.databricks.client.hivecommon.jdbc42.Hive42PreparedStatement.<init>(Unknown Source)
at com.databricks.client.spark.jdbc.SparkJDBCObjectFactory.createPreparedStatement(Unknown Source)
at com.databricks.client.jdbc.common.JDBCObjectFactory.newPreparedStatement(Unknown Source)
at com.databricks.client.jdbc.common.SConnection$5.create(Unknown Source)
at com.databricks.client.jdbc.common.SConnection$5.create(Unknown Source)
at com.databricks.client.jdbc.common.SConnection$StatementCreator.create(Unknown Source)
at com.databricks.client.jdbc.common.SConnection.prepareStatement(Unknown Source)
at com.databricks.client.jdbc.common.SConnection.prepareStatement(Unknown Source)
at com.databricks.client.jdbc.common.SConnection.prepareStatement(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.getQueryOutputSchema(JDBCRDD.scala:64)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:57)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:239)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:36)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
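For context, a minimal sketch of the read path that produces this error (the host, warehouse ID, and token below are placeholders, not values from the original post; the table name is taken from the probe query in the stack trace):

```python
# Connection options as they would be passed to the Spark JDBC reader.
# All bracketed values are assumptions/placeholders.
jdbc_options = {
    "url": (
        "jdbc:databricks://<workspace-host>:443/default;"
        "transportMode=http;ssl=1;"
        "httpPath=/sql/1.0/warehouses/<warehouse-id>;"
        "AuthMech=3;UID=token;PWD=<personal-access-token>"
    ),
    "driver": "com.databricks.client.jdbc.Driver",
    "dbtable": "default.user_cv_table",
}

# On a live Spark session the read would be:
#   df = spark.read.format("jdbc").options(**jdbc_options).load()
# Against an all-purpose cluster this load succeeds; against a SQL
# warehouse it fails with SQLException 500051 as shown above.
```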
02-05-2024 06:27 PM
Hi Kaniz, thanks for your reply.
I'm fairly certain the dbtable configuration is set correctly. The stack trace states that this query is being used:
SELECT * FROM `default`.`user_cv_table` WHERE 1=0
However, I'm not sure where that query is coming from. I'm not submitting it; wherever it is being constructed seems to be dropping the dbtable configuration field. My best guess is that it comes from this line:
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.getQueryOutputSchema(JDBCRDD.scala:64)
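That `WHERE 1=0` query is Spark's schema probe: when resolving a JDBC relation, Spark wraps the configured table in a zero-row SELECT so it can read the result schema without fetching data. A minimal Python sketch of what the Scala code on that line does (the function name here is illustrative, not Spark's):

```python
def schema_probe_query(table: str) -> str:
    # Mirrors the query Spark prepares in JDBCRDD.getQueryOutputSchema:
    # a SELECT that matches no rows, used only to obtain column metadata.
    return f"SELECT * FROM {table} WHERE 1=0"

# Reproduces the exact statement seen in the stack trace above.
print(schema_probe_query("`default`.`user_cv_table`"))
# → SELECT * FROM `default`.`user_cv_table` WHERE 1=0
```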
Note that my error is the same as previous question here:
https://community.databricks.com/t5/warehousing-analytics/aws-glue-and-databricks/td-p/4679
@Debayan, you mention there that dbtable is not supported; is this the same situation here?
03-08-2024 12:29 AM
I am also facing the same issue while writing to a hive_metastore table. In a notebook, spark.sql and saveAsTable work, but in a Spark job nothing seems to work with the SQL warehouse.
04-23-2024 11:58 AM
Same error here. I am trying to save a Spark DataFrame to Delta Lake using the JDBC driver and PySpark with this code:
# Spark session
spark_session = SparkSession.builder \
    .appName("RCT-API") \
    .config("spark.metrics.namespace", "rct-api") \
    .config("spark.driver.extraClassPath", "/opt/workspace/databricks-jdbc-2.6.34-sources.jar") \
    .config("spark.driver.extraLibrary", "/opt/workspace/databricks-jdbc-2.6.34-sources.jar") \
    .config("spark.sql.execution.arrow.pyspark.enabled", "true") \
    .config("spark.jars.packages", "com.databricks:databricks-jdbc:2.6.36") \
    .getOrCreate()

# Loading a Spark DataFrame from a pandas DataFrame
spark_df = spark_session.createDataFrame(pandas_df)

# Saving the design table to Delta Lake
spark_df.write.format("jdbc").options(
    url=connection_string,
    driver="com.databricks.client.jdbc.Driver",
    dbtable=table_name).mode('overwrite').save()
I am facing the following error:
Caused by: com.databricks.client.support.exceptions.ErrorException: [Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: 42601, Query: CREATE TAB***, Error message from Server: org.apache.hive.service.cli.HiveSQLException: Error running query: [PARSE_SYNTAX_ERROR] org.apache.spark.sql.catalyst.parser.ParseException:
[PARSE_SYNTAX_ERROR] Syntax error at or near '"id"'. SQLSTATE: 42601 (line 1, pos 33)
Any idea? This configuration worked with all-purpose clusters, but not SQL warehouses.
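One plausible reading of that PARSE_SYNTAX_ERROR: when the JDBC writer creates the target table, it renders column names using the JDBC dialect's identifier quoting, and the generic dialect quotes with ANSI double quotes, which the Databricks SQL parser rejects (it expects backticks). A simplified sketch of the difference (the function and column names are illustrative, not the actual Spark internals):

```python
def create_table_ddl(table: str, columns: dict, quote: str) -> str:
    # Illustrative sketch of how a JDBC writer renders CREATE TABLE DDL;
    # Spark delegates identifier quoting to its JdbcDialect.
    cols = ", ".join(f"{quote}{name}{quote} {typ}" for name, typ in columns.items())
    return f"CREATE TABLE {table} ({cols})"

# Generic-dialect style: double-quoted identifiers, matching the
# error's complaint at or near '"id"'.
print(create_table_ddl("user_cv_table", {"id": "BIGINT"}, '"'))
# → CREATE TABLE user_cv_table ("id" BIGINT)

# Backtick quoting is what the Databricks SQL parser accepts.
print(create_table_ddl("user_cv_table", {"id": "BIGINT"}, "`"))
# → CREATE TABLE user_cv_table (`id` BIGINT)
```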