Cannot connect to SQL Warehouse using JDBC connector in Spark

JustinM
New Contributor II

When trying to connect to a SQL warehouse using the JDBC connector with Spark, the error below is thrown. Note that connecting directly to a cluster with similar connection parameters works without issue; the error occurs only with SQL warehouses.

py4j.protocol.Py4JJavaError: An error occurred while calling o327.run. : java.sql.SQLException: [Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: null, Query: SELECT * FROM `default`.`user_cv_table` WHERE 1=0, Error message from Server: Configuration dbtable is not available..
at com.databricks.client.hivecommon.api.HS2Client.buildExceptionFromTStatusSqlState(Unknown Source)
at com.databricks.client.hivecommon.api.HS2Client.buildExceptionFromTStatus(Unknown Source)
at com.databricks.client.hivecommon.api.HS2Client.checkResponseStatus(Unknown Source)
at com.databricks.client.hivecommon.api.HS2Client.executeStatementInternal(Unknown Source)
at com.databricks.client.hivecommon.api.HS2Client.executeStatement(Unknown Source)
at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeNonRowCountQueryHelper(Unknown Source)
at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeQuery(Unknown Source)
at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.<init>(Unknown Source)
at com.databricks.client.hivecommon.dataengine.HiveJDBCDataEngine.prepare(Unknown Source)
at com.databricks.client.jdbc.common.SPreparedStatement.<init>(Unknown Source)
at com.databricks.client.jdbc.jdbc41.S41PreparedStatement.<init>(Unknown Source)
at com.databricks.client.jdbc.jdbc42.S42PreparedStatement.<init>(Unknown Source)
at com.databricks.client.hivecommon.jdbc42.Hive42PreparedStatement.<init>(Unknown Source)
at com.databricks.client.spark.jdbc.SparkJDBCObjectFactory.createPreparedStatement(Unknown Source)
at com.databricks.client.jdbc.common.JDBCObjectFactory.newPreparedStatement(Unknown Source)
at com.databricks.client.jdbc.common.SConnection$5.create(Unknown Source)
at com.databricks.client.jdbc.common.SConnection$5.create(Unknown Source)
at com.databricks.client.jdbc.common.SConnection$StatementCreator.create(Unknown Source)
at com.databricks.client.jdbc.common.SConnection.prepareStatement(Unknown Source)
at com.databricks.client.jdbc.common.SConnection.prepareStatement(Unknown Source)
at com.databricks.client.jdbc.common.SConnection.prepareStatement(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.getQueryOutputSchema(JDBCRDD.scala:64)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:57)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:239)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:36)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
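
For reference, the failing call is presumably the standard Spark JDBC read, roughly like the sketch below. Every connection value is a placeholder, not something from the original post, and the exact URL format should be checked against the documentation for your driver version:

# Hypothetical repro sketch (placeholders throughout), not the poster's actual code.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-repro").getOrCreate()

jdbc_url = (
    "jdbc:databricks://<workspace-host>:443/default;"
    "transportMode=http;ssl=1;"
    "httpPath=/sql/1.0/warehouses/<warehouse-id>;"
    "AuthMech=3;UID=token;PWD=<personal-access-token>"
)

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("driver", "com.databricks.client.jdbc.Driver")
    .option("dbtable", "`default`.`user_cv_table`")
    .load()
)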


Kaniz
Community Manager

Hi @JustinM,

Check your configuration settings: Ensure that the dbtable configuration is correctly set in your Spark code. The dbtable option should specify the table you want to load from your SQL warehouse (a short sketch follows below).

Update JDBC driver: Make sure you're using the latest JDBC driver compatible with your SQL warehouse....

Use ODBC: Some users have reported success by installing the ODBC while keeping the JDBC jar in the ....

Check your SQL warehouse: Ensure that the table user_cv_table exists in your SQL warehouse and that ....
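
To illustrate that first point, here is a minimal sketch of the two forms the stock Spark JDBC source accepts for dbtable: a qualified table name, or a parenthesized subquery with an alias. It assumes `spark` and a working warehouse JDBC URL in `jdbc_url` from your existing setup:

# Minimal sketch, assuming `spark` and a working warehouse JDBC URL in `jdbc_url`.
df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("driver", "com.databricks.client.jdbc.Driver")
    .option("dbtable", "`default`.`user_cv_table`")  # plain (qualified) table name
    # or the subquery form:
    # .option("dbtable", "(SELECT * FROM `default`.`user_cv_table`) t")
    .load()
)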

JustinM
New Contributor II

Hi Kaniz, thanks for your reply.

I'm fairly certain the dbtable configuration is set correctly. The stack trace states that this query is being used:

SELECT * FROM `default`.`user_cv_table` WHERE 1=0

However, I'm not sure where that query is coming from. I'm not submitting it; whatever is constructing it seems to be dropping the dbtable configuration field. My best guess is that it comes from this line:
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.getQueryOutputSchema(JDBCRDD.scala:64)

Note that my error is the same as in a previous question here:
https://community.databricks.com/t5/warehousing-analytics/aws-glue-and-databricks/td-p/4679

@Debayan, you mention there that dbtable is not supported; is this the same situation here?
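
For what it's worth, that WHERE 1=0 query is Spark's schema probe: before reading any rows, the stock JDBC source prepares SELECT * FROM <dbtable> WHERE 1=0 in JDBCRDD.getQueryOutputSchema to resolve column types, which matches the line quoted above. One hedged thing to try is the query option (available since Spark 2.4), which replaces dbtable and changes the SQL Spark generates for the probe; whether the warehouse accepts that form is an assumption to verify:

# Hedged sketch, assuming `spark` and a working warehouse JDBC URL in `jdbc_url`:
# with `query`, Spark wraps the statement in a subselect instead of probing the
# table name from `dbtable` directly.
df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("driver", "com.databricks.client.jdbc.Driver")
    .option("query", "SELECT * FROM `default`.`user_cv_table`")
    .load()
)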

akhileshp
New Contributor III

I am also facing the same issue while writing to a hive_metastore table. In a notebook, spark.sql and saveAsTable work, but in the Spark job nothing seems to work with the SQL warehouse.
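
For contrast, a minimal sketch of the notebook-side pattern described as working, which writes through the attached cluster's metastore rather than the JDBC data source (the table name is illustrative):

# Works in a notebook attached to a cluster, per the post above; the JDBC
# writer is not involved. Table name is illustrative.
spark_df.write.mode("overwrite").saveAsTable("default.my_table")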

jmms
New Contributor

Same error here. I am trying to save a Spark DataFrame to Delta Lake using the JDBC driver and PySpark with this code:

# Spark session
from pyspark.sql import SparkSession

spark_session = SparkSession.builder \
    .appName("RCT-API") \
    .config("spark.metrics.namespace", "rct-api") \
    .config("spark.driver.extraClassPath", "/opt/workspace/databricks-jdbc-2.6.34-sources.jar") \
    .config("spark.driver.extraLibrary", "/opt/workspace/databricks-jdbc-2.6.34-sources.jar") \
    .config("spark.sql.execution.arrow.pyspark.enabled", "true") \
    .config("spark.jars.packages", "com.databricks:databricks-jdbc:2.6.36") \
    .getOrCreate()

# Loading Spark DataFrame from pandas DataFrame
spark_df = spark_session.createDataFrame(pandas_df)

# Saving design table to Delta Lake
spark_df.write.format("jdbc").options(
    url=connection_string,
    driver="com.databricks.client.jdbc.Driver",
    dbtable=table_name,
).mode("overwrite").save()

I am facing the following error:

Caused by: com.databricks.client.support.exceptions.ErrorException: [Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: 42601, Query: CREATE TAB***, Error message from Server: org.apache.hive.service.cli.HiveSQLException: Error running query: [PARSE_SYNTAX_ERROR] org.apache.spark.sql.catalyst.parser.ParseException:
[PARSE_SYNTAX_ERROR] Syntax error at or near '"id"'. SQLSTATE: 42601 (line 1, pos 33)


Any ideas? This configuration worked with general-purpose clusters, but not with SQL warehouses.
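
One reading of this, offered as an assumption rather than a confirmed cause: Spark's default JDBC dialect wraps identifiers in double quotes when it generates CREATE TABLE DDL, and the [PARSE_SYNTAX_ERROR] at '"id"' suggests the warehouse's parser expects backticks instead. A hedged workaround sketch, assuming you can create the target table up front (for example with backticked DDL in the SQL editor): pre-create it, then let the JDBC writer truncate and insert instead of dropping and recreating:

# Hedged workaround sketch: with truncate=true under mode("overwrite"), Spark
# truncates the existing table instead of dropping and recreating it, so it
# never emits its own double-quoted CREATE TABLE DDL. Assumes the table
# already exists with a matching schema.
spark_df.write.format("jdbc").options(
    url=connection_string,
    driver="com.databricks.client.jdbc.Driver",
    dbtable=table_name,
    truncate="true",
).mode("overwrite").save()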

