
spark.sql not supporting kwargs as documented

brianbraunstein
New Contributor II
The documentation at https://api-docs.databricks.com/python/pyspark/latest/pyspark.sql/api/pyspark.sql.SparkSession.sql.h... says that spark.sql() accepts kwargs, so the following should work:
display(spark.sql('SELECT 9+{amount}', amount=88))
However, instead it produces this error:
TypeError: SparkSession.sql() got an unexpected keyword argument 'amount'
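
One way to confirm what the installed spark.sql() accepts, rather than what the docs describe, is to inspect its signature. Below is a minimal sketch; `accepts_var_kwargs` is a hypothetical helper, and the two stand-in functions only imitate a signature with and without `**kwargs` so the check can run without a cluster. They are assumptions, not the real PySpark source.

```python
import inspect


def accepts_var_kwargs(func):
    """Return True if `func` declares a **kwargs parameter."""
    params = inspect.signature(func).parameters.values()
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params)


# Hypothetical stand-in for a signature that takes arbitrary keyword arguments.
def sql_with_kwargs(sqlQuery, **kwargs):
    return sqlQuery


# Hypothetical stand-in for a signature that would raise
# "got an unexpected keyword argument" when called with amount=88.
def sql_without_kwargs(sqlQuery):
    return sqlQuery


print(accepts_var_kwargs(sql_with_kwargs))     # True
print(accepts_var_kwargs(sql_without_kwargs))  # False
```

In a notebook, `accepts_var_kwargs(spark.sql)` or simply `print(inspect.signature(spark.sql))` should show whether the runtime's method matches the documented one.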
 
 
Ultimately I would like to use the kwargs feature to be able to SQL over a dataframe, like the documented example on the page linked above:
mydf = spark.range(10)
spark.sql(
    "SELECT {col} FROM {mydf} WHERE id IN {x}",
    col=mydf.id, mydf=mydf, x=tuple(range(4)),
).show()

Obviously, the documented example gives the same error. I only included the first snippet as a simplified demonstration of the problem.
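
Until the kwargs form works, a possible workaround is to render the values into the SQL text in plain Python and refer to the DataFrame through a temp view. This is only a sketch under stated assumptions: `sql_literal` and `render_sql` are hypothetical helpers, not part of PySpark, and they deliberately accept only numbers and non-empty sequences of numbers, since splicing arbitrary strings into SQL is unsafe.

```python
import math


def sql_literal(value):
    """Render a number, or a non-empty list/tuple of numbers, as SQL text."""
    if isinstance(value, bool):
        # bool is a subclass of int, so handle it before the numeric branch.
        return "TRUE" if value else "FALSE"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        if not math.isfinite(value):
            raise ValueError("NaN and infinity are not rendered")
        return repr(value)
    if isinstance(value, (list, tuple)):
        if not value:
            raise ValueError("an empty sequence would produce invalid SQL")
        return "(" + ", ".join(sql_literal(v) for v in value) + ")"
    # Strings and other types are rejected on purpose (injection risk).
    raise TypeError(f"unsupported type: {type(value).__name__}")


def render_sql(template, **params):
    """Fill {name} placeholders in `template` with rendered literals."""
    return template.format(**{k: sql_literal(v) for k, v in params.items()})


print(render_sql("SELECT 9+{amount}", amount=88))
# SELECT 9+88
print(render_sql("SELECT id FROM mydf WHERE id IN {x}", x=tuple(range(4))))
# SELECT id FROM mydf WHERE id IN (0, 1, 2, 3)
```

In the notebook, the second query would be run by first calling `mydf.createOrReplaceTempView("mydf")` and then passing the rendered string to `spark.sql(...)`. Open-source PySpark 3.4+ also documents an `args` parameter with named markers (for example `spark.sql("SELECT 9 + :amount", args={"amount": 88})`); whether that form works on this runtime is not verified here.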

 
My databricks-sdk is up to date:
%pip install --upgrade databricks-sdk
dbutils.library.restartPython()
%pip show databricks-sdk

...
Name: databricks-sdk
Version: 0.26.0
Summary: Databricks SDK for Python (Beta)
Home-page: https://databricks-sdk-py.readthedocs.io
Author: Serge Smertin
Author-email: serge.smertin@databricks.com
License: UNKNOWN
Location: /local_disk0/.ephemeral_nfs/envs/pythonEnv-07bec08e-988d-4ca2-8229-83561254d3c0/lib/python3.10/site-packages
Requires: google-auth, requests
Required-by:
 
My Spark version is up to date:
display(spark.sql('SELECT version()'))

3.5.0 0000000000000000000000000000000000000000
 

 

1 REPLY

brianbraunstein
New Contributor II

Ok, it looks like Databricks might have broken this functionality shortly after it came out: https://community.databricks.com/t5/data-engineering/parameterized-spark-sql-not-working/m-p/57969/h...