07-26-2024 03:44 PM
I'm trying to use Databricks Connect to run queries on Delta Tables locally. However, SQL queries using spark.sql don't seem to work properly, even though spark.read.table works.
>>> from databricks.connect import DatabricksSession
>>> spark = DatabricksSession.builder.profile("<profile name>").serverless().getOrCreate()
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1722031752.676173 13015515 config.cc:230] gRPC experiments enabled: call_status_override_on_cancellation, event_engine_dns, event_engine_listener, http2_stats_fix, monitoring_experiment, pick_first_new, trace_record_callops, work_serializer_clears_time_cache
>>> spark.read.table('<table name>')
DataFrame[...]
>>> spark.sql("SELECT * FROM <table name>")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/adawson/.pyenv/versions/pyspark/lib/python3.10/site-packages/pyspark/sql/connect/session.py", line 733, in sql
data, properties = self.client.execute_command(cmd.command(self._client))
File "/Users/adawson/.pyenv/versions/pyspark/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1270, in execute_command
data, _, _, _, properties = self._execute_and_fetch(
File "/Users/adawson/.pyenv/versions/pyspark/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1717, in _execute_and_fetch
for response in self._execute_and_fetch_as_iterator(
File "/Users/adawson/.pyenv/versions/pyspark/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1693, in _execute_and_fetch_as_iterator
self._handle_error(error)
File "/Users/adawson/.pyenv/versions/pyspark/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 2009, in _handle_error
self._handle_rpc_error(error)
File "/Users/adawson/.pyenv/versions/pyspark/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 2097, in _handle_rpc_error
raise convert_exception(
pyspark.errors.exceptions.connect.ParseException:
[PARSE_EMPTY_STATEMENT] Syntax error, unexpected empty statement. SQLSTATE: 42617 (line 1, pos 0)
== SQL ==
^^^
Has anyone else run into this issue? I'm on an M2 Pro Mac, running Python 3.10, pyspark 3.5.1, databricks-connect 15.3.1, databricks-sdk 0.29.0, Databricks CLI v0.224.1.
09-02-2024 04:21 AM
Hi everyone!
I am an engineer working on Databricks Connect. This error appears because of an incompatibility between the Serverless Compute and Databricks Connect versions. The current Serverless Compute release roughly corresponds to Databricks Runtime 15.1, while you are using Databricks Connect 15.3. We guarantee forward compatibility in Databricks Connect Python, which means that the Databricks Connect version should be lower than or equal to the Serverless Runtime version. You should switch to DBConnect 15.1 for now, until Serverless Compute is upgraded.
Serverless Compute releases currently lag behind Classic DBR releases. We are working on reducing the delay for Serverless Compute releases.
You can find the Runtime version used by Serverless compute in these release notes https://docs.databricks.com/en/release-notes/serverless.html#version-202430
and on this page https://docs.databricks.com/en/dev-tools/databricks-connect/python/install.html
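For anyone who wants to check their environment before connecting, here is a minimal sketch of the version rule described above (client major.minor must be lower than or equal to the runtime major.minor). The helper names and the `SERVERLESS_RUNTIME` constant are hypothetical and not part of Databricks Connect; the value "15.1" is only the figure quoted in this thread, so look up the current one in the release notes linked above.

```python
from importlib import metadata
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '15.3.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_client_compatible(client_version: str, runtime_version: str) -> bool:
    """True when the client's major.minor is <= the runtime's major.minor."""
    return major_minor(client_version) <= major_minor(runtime_version)


def installed_connect_version() -> Optional[str]:
    """Return the installed databricks-connect version, or None if absent."""
    try:
        return metadata.version("databricks-connect")
    except metadata.PackageNotFoundError:
        return None


# Assumption: taken from this thread, not queried from the workspace.
SERVERLESS_RUNTIME = "15.1"

client = installed_connect_version()
if client is None:
    print("databricks-connect is not installed in this environment")
elif is_client_compatible(client, SERVERLESS_RUNTIME):
    print(f"databricks-connect {client} should work with runtime {SERVERLESS_RUNTIME}")
else:
    print(f"databricks-connect {client} is newer than runtime {SERVERLESS_RUNTIME}; downgrade it")
```

If the check reports a newer client, pinning the package to the matching minor release, for example with pip install "databricks-connect==15.1.*", should bring the two back in line.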
08-01-2024 08:43 AM
Hi Kaniz,
Could you clarify what you mean? I assume the session is correctly initialized since I can use `spark.read.table` without issue, but I don't know what else I should check.
08-15-2024 05:22 AM
I have the exact same issue. The session seems to be working: spark.table("mytable").show() produces results.
But I can't get any SQL statement to work.
09-10-2024 03:25 AM
If you are using a cluster, try the latest version of databricks-connect for that runtime. For example, if you are on Runtime 14.3, use databricks-connect 14.3.2 (Python). I faced the same issue with an older version.
09-11-2024 03:13 PM
Thank you so much! That did the trick.