Warehousing & Analytics

AWS Glue and Databricks

dannylee
New Contributor III

Hello, we're receiving an error when running glue jobs to try and connect to and read from a Databricks SQL endpoint.

An error occurred while calling o104.load. [Databricks][DatabricksJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Configuration dbtable is not available.:48:47, org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$:hiveOperatingError:HiveThriftServerErrors.scala:65, com.databricks.sql.hive.thriftserver.thrift.ErrorPropagationThriftHandler:runSafely:ErrorPropagationThriftHandler.scala:124, com.databricks.sql.hive.thriftserver.thrift.ErrorPropagationThriftHandler:ExecuteStatement:ErrorPropagationThriftHandler.scala:73, org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:429, org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1437, org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1422, org.apache.thrift.ProcessFunction:process:ProcessFunction
 
The same read() query with the same options() works fine when I run it on a local PySpark cluster, but it's failing in Glue. I suspect it could be related to the GlueContext - has anyone run across this issue or have an idea what might be causing it?



Accepted Solutions

dannylee
New Contributor III

Hello @Vidula Khanna @Debayan Mukherjee,

I wanted to give you an update that might be helpful for your future customers. We worked with @Pavan Kumar Chalamcharla, and through lots of trial and error we figured out a combination that works for SQL endpoints with the dbtable option on Glue 4.0.

This combination does not work with the query option, and neither dbtable nor query works in Glue 3.0. On Glue 4.0, we were able to connect and run a read with the dbtable option written as a subquery:

ex: (SELECT 1) as subq

We were also able to use the following options:

  • partitionColumn
  • lowerBound
  • upperBound
  • numPartitions
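
For readers trying to reproduce this, a minimal PySpark sketch of the combination described above might look like the following. Only the subquery form of dbtable and the four partitioning option names come from the post; the helper names (`build_jdbc_options`, `read_subquery`), the driver class name, and the idea of validating the options up front are assumptions for illustration, not the original job's code.

```python
def build_jdbc_options(url, subquery, alias="subq", partition_column=None,
                       lower_bound=None, upper_bound=None, num_partitions=None):
    """Build Spark JDBC reader options with dbtable written as an aliased subquery.

    `url` is the JDBC connection string for the SQL endpoint (not shown here).
    The driver class below is an assumption; use whatever class ships with the
    Databricks JDBC jar attached to the Glue job.
    """
    options = {
        "url": url,
        "driver": "com.databricks.client.jdbc.Driver",  # assumed class name
        # The form reported to work on Glue 4.0: "(SELECT ...) as subq"
        "dbtable": f"({subquery}) AS {alias}",
    }
    partition_args = (partition_column, lower_bound, upper_bound, num_partitions)
    if any(arg is not None for arg in partition_args):
        # Spark expects the partitioning options to be supplied together.
        if any(arg is None for arg in partition_args):
            raise ValueError(
                "partitionColumn, lowerBound, upperBound and numPartitions "
                "must all be set together"
            )
        options.update({
            "partitionColumn": partition_column,
            "lowerBound": str(lower_bound),
            "upperBound": str(upper_bound),
            "numPartitions": str(num_partitions),
        })
    return options


def read_subquery(spark, options):
    """Run the JDBC read; `spark` is an existing SparkSession (for example
    glueContext.spark_session inside a Glue job)."""
    return spark.read.format("jdbc").options(**options).load()
```

A call such as `build_jdbc_options(url, "SELECT 1")` yields a dbtable value of `(SELECT 1) AS subq`, matching the example above; the resulting dictionary can then be passed to `read_subquery`.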

However, I'm not 100% confident that it's bug-free. The job succeeds and the data is loaded, but it's unclear whether the partitioning is happening optimally.
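
One way to reason about whether the partitioning is sensible is to look at the range predicates a partitioned JDBC read produces. In Spark, lowerBound and upperBound do not filter rows; they only decide the stride between partitions, so bounds that do not match the real value range of partitionColumn leave most rows in the first or last partition. The function below is a simplified approximation of that splitting logic written for illustration (the name `approximate_partition_predicates` is hypothetical, and Spark's own implementation handles rounding and edge cases differently).

```python
def approximate_partition_predicates(column, lower_bound, upper_bound, num_partitions):
    """Approximate the per-partition WHERE clauses of a partitioned JDBC read.

    Simplified sketch: integer bounds only, even stride, no stride alignment.
    """
    if num_partitions < 2:
        raise ValueError("num_partitions must be at least 2 for a split read")
    if upper_bound - lower_bound < num_partitions:
        raise ValueError("the bound range is too small for this many partitions")

    stride = (upper_bound - lower_bound) // num_partitions
    predicates = []
    for i in range(num_partitions):
        start = lower_bound + i * stride
        end = start + stride
        if i == 0:
            # First partition is open below the range and also collects NULLs.
            predicates.append(f"{column} < {end} OR {column} IS NULL")
        elif i == num_partitions - 1:
            # Last partition is open above the range.
            predicates.append(f"{column} >= {start}")
        else:
            predicates.append(f"{column} >= {start} AND {column} < {end}")
    return predicates


for clause in approximate_partition_predicates("id", 0, 100, 4):
    print(clause)
# id < 25 OR id IS NULL
# id >= 25 AND id < 50
# id >= 50 AND id < 75
# id >= 75
```

On the loaded DataFrame itself, `df.rdd.getNumPartitions()` reports how many partitions were created, and grouping by `pyspark.sql.functions.spark_partition_id()` with a count shows how evenly the rows landed across them.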

Overall, it's good progress! Thanks to @Pavan Kumar Chalamcharla for getting us the info we needed to iterate through the different test cases.


7 REPLIES

Debayan
Esteemed Contributor III

Hi, the dbtable option is not supported when connecting to a DBSQL endpoint/warehouse. Could you please share some more context here?

Please let us know if this helps. Also, please tag @Debayan​ with your next comment so that I will get notified. Thank you!

dannylee
New Contributor III

@Debayan Mukherjee​ This helps - it was hard to find information on whether it was supported and the error message "Configuration dbtable is not available" was not returning search results.

We are connecting with AWS Glue (Spark) and trying to pull data from an endpoint. Previously we attached to a cluster, which worked fine. We tested using sql.connect() and it worked, but we had concerns about the code rework needed and the speed/robustness of using the cursor vs. JDBC. Any thoughts?

Debayan
Esteemed Contributor III

Hi @Danny Lee, unfortunately the dbtable option is not supported when connecting to DBSQL warehouses or endpoints, and there is no workaround for it as of now.

Anonymous
Not applicable

Hi @Danny Lee​ 

Hope all is well! Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!

dannylee
New Contributor III

Hi Vidula, still working on this issue. The enterprise team supporting our source data recommended that we try the query keyword with the DBSQL endpoint. I'm not sure whether this works or whether the team is aware of the limitations, but we're still waiting to hear back from the developers.

Thanks for checking in.

Debayan
Esteemed Contributor III

Thanks for the update, we are glad to know it's working.
