Hello, thank you for your question.
The initial metadata query (the "Listing Columns" query) is tied to the SparkGetColumnsOperation class, which is part of the Apache Hive ThriftServer and Spark's handling of JDBC metadata operations.
Can you please confirm the SELECT COUNT LIMIT statement is triggering this first 15+ seconds query, and not the rset.getMetaData().getColumnCount(); ?
With respect to the SELECT statement, at least initially, it is ambiguous which catalog or schema the table resides in, so the driver needs to validate metadata to ensure the table and columns exist. But, the rset.getMetaData().getColumnCount(), explicitly fetches metadata for the query's result set. Internally, it uses DatabaseMetaData.getColumns() or similar operations which could be invoking the SparkGetColumnsOperation class. The metadata operation retrieves schema information about the columns in the result set (names, types, nullability, etc.), even if the query result is a single column like count(*). I would expect this to be more probable trigger for the metadata query as it ensures the ResultSet is correctly structured.
Please try some of these and compare the results:
- Skip the metadata call and directly process the ResultSet:
- final ResultSet rset = statement.executeQuery("SELECT count(*) FROM table_name LIMIT 10"); while (rset.next()) { System.out.println(rset.getInt(1)); }
- Use the FQN:
- final ResultSet rset = statement.executeQuery("SELECT count(*) FROM my_catalog.my_schema.table_name LIMIT 10");
- Use PreparedStatements:
-
String query = "SELECT count(*) FROM table_name LIMIT 10";
PreparedStatement preparedStatement = connection.prepareStatement(query);
ResultSet rset = preparedStatement.executeQuery();
while (rset.next()) {
System.out.println(rset.getInt(1));
}
So, ultimately something like this:
String TOKEN = "<token>";
String HTTP_PATH = "/sql/1.0/warehouses/<sql-warehouse-id>";
final String connStr = "jdbc:spark://discover.cloud.databricks.com:443/default;" +
"TransportMode=http;SSL=1;" +
"HTTPPath=" + HTTP_PATH + ";" +
"AuthMech=11;" +
"Auth_Flow=0;" +
"Auth_AccessToken=" + TOKEN + ";";
try (Connection connection = DriverManager.getConnection(connStr);
Statement statement = connection.createStatement()) {
final ResultSet rset = statement.executeQuery("SELECT count(*) FROM my_catalog.my_schema.table_name LIMIT 10");
while (rset.next()) {
System.out.println(rset.getInt(1)); // Directly fetch the result
}
} catch (SQLException e) {
e.printStackTrace();
}