Data Engineering

SparkSession conf: strange bug when injecting a property containing "hourl" into a SQL query

sfalquier
New Contributor II

Assuming you have a catalog "my_catalog" and a schema "my_schema", the following code fails:

full_table_location = "`my_catalog`.`my_schema`.`my_table_hourl`"
spark.conf.set("fullTableName", full_table_location)
spark.sql("""SELECT * FROM ${fullTableName} LIMIT 1""")


It fails not because the table cannot be found, but because the table name contains the string "hourl". The substituted value my_catalog.my_schema.my_table_hourl is obfuscated as *********(redacted), which causes the following ParseException:

ParseException                            Traceback (most recent call last)
File <command-1602645165993644>, line 1
----> 1 spark.sql("""SELECT * FROM ${a} LIMIT 1""")

File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.<locals>.wrapper(*args, **kwargs)
     46 start = time.perf_counter()
     47 try:
---> 48     res = func(*args, **kwargs)
     49     logger.log_success(
     50         module_name, class_name, function_name, time.perf_counter() - start, signature
     51     )
     52     return res

File /databricks/spark/python/pyspark/sql/session.py:1602, in SparkSession.sql(self, sqlQuery, args, **kwargs)
   1598         assert self._jvm is not None
   1599         litArgs = self._jvm.PythonUtils.toArray(
   1600             [_to_java_column(lit(v)) for v in (args or [])]
   1601         )
-> 1602     return DataFrame(self._jsparkSession.sql(sqlQuery, litArgs), self)
   1603 finally:
   1604     if len(kwargs) > 0:

File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
   1316 command = proto.CALL_COMMAND_NAME +\
   1317     self.command_header +\
   1318     args_command +\
   1319     proto.END_COMMAND_PART
   1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
   1323     answer, self.gateway_client, self.target_id, self.name)
   1325 for temp_arg in temp_args:
   1326     if hasattr(temp_arg, "_detach"):

File /databricks/spark/python/pyspark/errors/exceptions/captured.py:194, in capture_sql_exception.<locals>.deco(*a, **kw)
    190 converted = convert_exception(e.java_exception)
    191 if not isinstance(converted, UnknownException):
    192     # Hide where the exception came from that shows a non-Pythonic
    193     # JVM exception message.
--> 194     raise converted from None
    195 else:
    196     raise

ParseException: 
[PARSE_SYNTAX_ERROR] Syntax error at or near '*'.(line 1, pos 14)

== SQL ==
SELECT * FROM *********(redacted) LIMIT 1
--------------^^^


Since it works properly with any table name that does not contain the substring "hourl", any idea what is going wrong here?
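A possible workaround sketch, assuming the redaction happens during the ${...} substitution of the conf value and not in the SQL parser itself: build the query string in Python so that no conf property is involved. The helper name build_query is hypothetical, and this has not been verified on a Databricks cluster. It may also be worth noting that "hourl" contains the substring "url"; whether that is what triggers the redaction is only a guess.

```python
def build_query(full_table_name: str, limit: int = 1) -> str:
    """Build the SELECT statement in plain Python.

    Hypothetical helper: it only formats a string, so the table name
    never goes through spark.conf or ${...} substitution.
    """
    return f"SELECT * FROM {full_table_name} LIMIT {limit}"


# Same fully qualified, backtick-quoted name as in the failing snippet.
full_table_location = "`my_catalog`.`my_schema`.`my_table_hourl`"
query = build_query(full_table_location)
print(query)

# With an active SparkSession named `spark` (assumed, not created here):
# df = spark.sql(query)
```

Only pass a trusted table name to such a helper, since the value is interpolated into the SQL text as-is.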

