- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
05-01-2025 11:15 AM
I am reading data from Databricks in datatstage 11.7 on prem using datastage JDBC connector and getting the below error. I tried to limit the select queries to one row , it was able to read data form the source,
JDBC_Connector_0: The connector encountered a Java exception:
java.sql.SQLException: [Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: null, Query: select cola,colb from table1, Error message from Server: Configuration AutoCommit is not available..
I am using the latest JDBC driver available on databricks and done the required configuration. any assistance on this would be of great help
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
05-01-2025 11:49 AM
Greetings Fuzail, here are some suggestions you might want to consider:
Analysis and Recommendations:
-
Auto-Commit Behavior in Databricks JDBC Driver:
- The Databricks JDBC driver operates with auto-commit mode enabled by default. This means that manual commit operations are generally not supported, as indicated by related error messages like "Cannot use commit while connection is in auto-commit mode" from similar issues.
-
Resolution Steps:
- Ensure that the DataStage JDBC properties explicitly set the
AutoCommitparameter totruein its connection settings to align with the Databricks JDBC driver's behavior. This adjustment should prevent the connector from attempting manual commits, which are not supported. - Refer to the DataStage and Databricks configuration documentation to locate where these connection properties can be explicitly defined, such as in the configuration wizard or JDBC connection string.
- Ensure that the DataStage JDBC properties explicitly set the
-
Additional Debugging:
- If the error persists, verify that the latest Databricks JDBC driver version is being used, as updates might include fixes for such compatibility issues. The latest versions are recommended for addressing known problems.
- Cross-check the logs to identify whether any test queries fired by DataStage during connection initiation might conflict with the driver's auto-commit behavior.
-
Documentation:
- See the Databricks JDBC Driver Installation and Configuration Guide for further insights on how the driver handles transaction-related operations and auto-commit settings.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
05-01-2025 04:29 PM
Thank you so much for the quick help. setting auto commit to true resolved the issue. I just have one more followup question, The update to databricks using JDBC is taking very longer time and looks like its processing row by row, I tried to adjust the setting of the connector but does not help. From the datastage log i can see "The driver does not support batch updates. The connector will enforce the batch size value of 1." Is there any possible workaround for this issue.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
05-02-2025 10:12 AM
I just have one more followup question, The update to databricks using JDBC is taking very longer time and looks like its processing row by row, I tried to adjust the setting of the connector but does not help. From the datastage log i can see "The driver does not support batch updates. The connector will enforce the batch size value of 1." Is there any possible workaround for this issue @Louis_Frolio , can you provide your suggestion for this.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
05-02-2025 10:33 AM
Here are some suggestions, not sure if it fits with what you are doing but they are worth mentioning.
-
Driver Limitation:
- The Databricks JDBC driver enforces a batch size of 1 for updates because it does not currently support batch operations in auto-commit mode.
-
Workarounds:
- Use
COPY INTO: Databricks supports theCOPY INTOcommand, which can handle bulk data ingestion efficiently. This approach sidesteps the limitations of JDBC for batch updates. - Batch Inserts Using Spark SQL: You can implement a workaround by inserting multiple rows in a single SQL statement via Spark SQL's
VALUESclause. For instance, you can construct anINSERT INTOstatement that batches hundreds of rows within a single operation. Note that you may need additional logic to handle splitting large jobs into manageable chunks. - Programmatic Ingestion: If
COPY INTOor Spark SQL is not feasible, consider using Databricks' supported ingestion methods like DataFrames or Delta Lake APIs for optimized data writes.
- Use