We are experiencing the following issues.
Description:
I encountered an issue while executing a Spark SQL query in Databricks, and it seems to be related to the query optimization phase. The error message suggests an internal bug within Spark or the Spark plugins used in our environment. I am reaching out to report this issue and seek assistance in troubleshooting and resolving it.
Error Message:
[INTERNAL_ERROR] The Spark SQL phase optimization failed with an internal error. You hit a bug in Spark or the Spark plugins you use. Please, report this bug to the corresponding communities or vendors, and provide the full stack trace. SQLSTATE: XX000
Environment Details:
- Cloud: Azure
- Compute: SQL Warehouse, Serverless
- Channel: (v 2024.35)
- Size: X-Small
Query:
Here is the SQL query that triggered the internal error:
WITH isos AS (
  SELECT DISTINCT
    Name AS full_name,
    CAST(RecordingTime AS DATE) AS test_date,
    LOWER(TestType) AS test_type,
    TestTypeName,
    CASE
      WHEN LOWER(TestTypeName) = 'slppf' OR LOWER(TestType) = 'rsaip' THEN 'ankle iso push'
      WHEN LOWER(TestTypeName) = 'slsquat' OR LOWER(TestType) = 'rskip' THEN 'knee iso push'
      WHEN LOWER(TestTypeName) = 'slhipext' OR LOWER(TestType) = 'rship' THEN 'hip iso push'
      ELSE LOWER(TestTypeName)
    END AS test_type_name,
    Limb,
    `Peak Vertical Force / BW` AS peak_vertical_force
  FROM <catalog>.<schema>.<table>
),
isos_next AS (
  SELECT *
  FROM isos
  WHERE test_type_name = 'hip iso push'
)
SELECT * FROM isos_next;
Steps to Reproduce:
- Run the SQL query above in a Databricks notebook.
- The error is raised during query execution, in the Spark SQL optimization phase.
Troubleshooting Attempts:
- I simplified the query to isolate the issue, and it appears to be related to the internal Spark SQL optimizer.
- I've checked for any unsupported functions or potential syntax issues, but the query seems valid based on Spark SQL syntax.
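For anyone who wants to narrow this down further, the reduced variants below are one possible way to bisect the query. This is only a sketch and not a confirmed fix or workaround. It assumes the failure comes from the combination of the derived CASE column, DISTINCT, and the filter on the derived column, which has not been verified. The table placeholder is the same one used in the original query.

```sql
-- Variant 1: request the plan only. If this also fails with INTERNAL_ERROR,
-- the problem is likely hit while planning, before any data is read.
EXPLAIN EXTENDED
WITH isos AS (
  SELECT DISTINCT
    CASE
      WHEN LOWER(TestTypeName) = 'slppf' OR LOWER(TestType) = 'rsaip' THEN 'ankle iso push'
      WHEN LOWER(TestTypeName) = 'slsquat' OR LOWER(TestType) = 'rskip' THEN 'knee iso push'
      WHEN LOWER(TestTypeName) = 'slhipext' OR LOWER(TestType) = 'rship' THEN 'hip iso push'
      ELSE LOWER(TestTypeName)
    END AS test_type_name
  FROM <catalog>.<schema>.<table>
)
SELECT * FROM isos
WHERE test_type_name = 'hip iso push';

-- Variant 2: the same reduced query without DISTINCT, to check whether
-- deduplication is part of the trigger.
WITH isos AS (
  SELECT
    CASE
      WHEN LOWER(TestTypeName) = 'slppf' OR LOWER(TestType) = 'rsaip' THEN 'ankle iso push'
      WHEN LOWER(TestTypeName) = 'slsquat' OR LOWER(TestType) = 'rskip' THEN 'knee iso push'
      WHEN LOWER(TestTypeName) = 'slhipext' OR LOWER(TestType) = 'rship' THEN 'hip iso push'
      ELSE LOWER(TestTypeName)
    END AS test_type_name
  FROM <catalog>.<schema>.<table>
)
SELECT * FROM isos
WHERE test_type_name = 'hip iso push';

-- Variant 3: the reduced query without the filter on the derived column,
-- to check whether filtering on the CASE result is part of the trigger.
WITH isos AS (
  SELECT DISTINCT
    CASE
      WHEN LOWER(TestTypeName) = 'slppf' OR LOWER(TestType) = 'rsaip' THEN 'ankle iso push'
      WHEN LOWER(TestTypeName) = 'slsquat' OR LOWER(TestType) = 'rskip' THEN 'knee iso push'
      WHEN LOWER(TestTypeName) = 'slhipext' OR LOWER(TestType) = 'rship' THEN 'hip iso push'
      ELSE LOWER(TestTypeName)
    END AS test_type_name
  FROM <catalog>.<schema>.<table>
)
SELECT * FROM isos;
```

Noting which variants fail and which succeed, along with the full stack trace for each failure, should make the report easier to act on.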
Request:
I believe this may be a bug in Spark or the Databricks runtime, and I would appreciate guidance on resolving it. If further investigation is required, I am happy to provide any additional details or logs.