Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

SQL Warehouse: Retrieving SQL ARRAY Type via JDBC driver

wilco
New Contributor II

Hi all,

we are currently running into the following issue:

  • we are using a serverless SQL warehouse
  • in a Java application we are using the latest Databricks JDBC driver (v2.6.36)
  • we are querying the warehouse with the collect_list function, which should return an ARRAY type
  • instead we are getting a JSON list as a string

We are trying the query mentioned in the documentation:

SELECT collect_list(col) FROM VALUES (1), (2), (NULL), (1) AS tab(col);

The ResultSetMetaData of the ResultSet reports the java.sql.Types constant of the retrieved column as VARCHAR:

String query = "SELECT collect_list(col) result_value FROM VALUES (1), (1), (NULL), (2) AS tab(col)";
ResultSet resultSet = connection.createStatement().executeQuery(query);
ResultSetMetaData resultSetMetaData = resultSet.getMetaData();
int columnType = resultSetMetaData.getColumnType(1);
// columnType == java.sql.Types.VARCHAR (12), not java.sql.Types.ARRAY (2003)

Running the same query in the Databricks SQL Editor in the browser returns the correct type (of course).

Any idea how to get the array type when using the JDBC driver?

Many thanks in advance!

3 REPLIES

Kaniz_Fatma
Community Manager

Hi @wilco, it appears that you’re encountering an issue with the Databricks JDBC driver when retrieving an ARRAY type using the collect_list function.

Let’s explore some steps to address this:

  1. JDBC Driver Version: Ensure that you’re using the latest version of the Databricks JDBC driver. You mentioned using version 2.6.36, which is great. However, it’s always a good practice to verify whether any newer versions are available. You can find the latest driver on the Databricks JDBC Driver documentation page.

  2. ResultSet Metadata: The issue seems to be related to the metadata returned by the ResultSet. When you execute the query, the column type is reported as java.sql.Types.VARCHAR with an integer value of 12. This indicates that the result is being treated as a string (VARCHAR) rather than an ARRAY.

  3. Data Type Mapping: The JDBC driver should automatically map the ARRAY type correctly. However, let’s explicitly specify the data type in your query to ensure proper handling. Modify your query as follows:

    SELECT collect_list(col) AS result_value FROM VALUES (1), (1), (NULL), (2) AS tab(col);
    

    By adding AS result_value, we explicitly define the column name and its expected data type.

  4. Check the Result: After making the modification, re-run the query using the updated JDBC driver. Verify that the result now correctly returns an ARRAY type. You can also check the metadata again to ensure that the column type reflects the correct data type.
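To make the check in step 4 concrete, here is a small sketch (the class and method names are my own, not part of the driver) that turns the java.sql.Types code reported by getColumnType into a readable name by reflecting over the constants in java.sql.Types:

```java
import java.sql.Types;

public class JdbcTypeCheck {
    // Map a java.sql.Types code to its constant name, e.g. 12 -> "VARCHAR".
    public static String describeJdbcType(int code) {
        for (java.lang.reflect.Field f : Types.class.getFields()) {
            try {
                if (f.getType() == int.class && f.getInt(null) == code) {
                    return f.getName();
                }
            } catch (IllegalAccessException ignored) {
                // Fields of java.sql.Types are public; this should not happen.
            }
        }
        return "UNKNOWN(" + code + ")";
    }

    public static void main(String[] args) {
        System.out.println(describeJdbcType(12));   // VARCHAR
        System.out.println(describeJdbcType(2003)); // ARRAY
    }
}
```

In your application you would pass resultSetMetaData.getColumnType(1) to this helper; if the fix works, it should print ARRAY instead of VARCHAR.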

Remember that the Databricks SQL Editor in the browser may handle the result differently, so it’s essential to focus on the behavior within your Java application using the JDBC driver. 

Best of luck, and I hope this helps you resolve the issue! 🚀

 

wilco
New Contributor II

Hi @Kaniz_Fatma 

many thanks for your response. I followed your proposed steps to resolve the issue:

  1. We use the latest released driver version, which is 2.6.36: https://www.databricks.com/spark/jdbc-drivers-archive
  2. Yes, this is indeed the issue: not an ARRAY but a VARCHAR type is returned
  3. Unfortunately, explicitly defining the result name does not change the resulting type. I don't see how the expected data type would be specified by adding AS result_value
  4. Unfortunately nothing changed

Does the underlying Simba driver support complex types such as ARRAY?

 

KTheJoker
Contributor II

Hey Wilco,

The answer is no: ODBC/JDBC don't support complex types, so these need to be serialized into strings over the wire (usually in a JSON representation) and rehydrated on the client side into a complex object.
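To illustrate the rehydration step, here is a minimal sketch that parses the string you would get from resultSet.getString(1) back into a List&lt;Integer&gt;. The class name and the exact wire format are assumptions (a flat JSON array of integer literals and nulls, matching what collect_list over an INT column produces); a real application would more likely use a JSON library such as Jackson or Gson:

```java
import java.util.ArrayList;
import java.util.List;

public class ArrayRehydrate {
    // Parse a flat JSON array of integers and nulls, e.g. "[1,1,null,2]",
    // into a List<Integer>. Hand-rolled on purpose to stay dependency-free;
    // it does not handle nested arrays, strings, or other JSON values.
    public static List<Integer> parseIntArray(String json) {
        String body = json.trim();
        if (!body.startsWith("[") || !body.endsWith("]")) {
            throw new IllegalArgumentException("not a JSON array: " + json);
        }
        body = body.substring(1, body.length() - 1).trim();
        List<Integer> out = new ArrayList<>();
        if (body.isEmpty()) {
            return out; // empty array "[]"
        }
        for (String token : body.split(",")) {
            String t = token.trim();
            out.add(t.equals("null") ? null : Integer.valueOf(t));
        }
        return out;
    }

    public static void main(String[] args) {
        // In the real application the string would come from resultSet.getString(1).
        System.out.println(parseIntArray("[1,1,null,2]")); // [1, 1, null, 2]
    }
}
```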
