Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Different error handling behavior after DB runtime upgrade from 13.3 to 14.3

Karlo_Kotarac
New Contributor III

Hi! We want to upgrade the DB runtime on our clusters from 13.3 LTS to 14.3 LTS. So far everything looks good except for the different error handling in the new runtime.

For example, the error in the 13.3 LTS runtime looks familiar:

(screenshot: error message on 13.3 LTS)

while the same code on 14.3 LTS runtime throws the following error:

(screenshot: error message on 14.3 LTS)

Only after digging deeper into the error logs can I see that the underlying error is actually the same in this case:

(screenshot: expanded error log showing the underlying error)

Not sure if it's important, but we use the spark.sql() function to run the MERGE INTO command. Is there a way to restore the previous error-handling behaviour? The current errors are not informative.
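For context, the masking seen on 14.3 looks like ordinary Python exception chaining: when a custom exception handler itself fails while formatting the original error, Python surfaces the handler's error and keeps the real one on `__context__`. A minimal, Spark-free sketch of that mechanism (the function names here are illustrative, not Databricks internals):

```python
def flaky_handler(exc):
    # Stand-in for a display handler that itself fails
    # (like getQueryContext() raising Py4JError on 14.3).
    raise TypeError("handler broke while formatting the error")

surfaced = None
original = None
try:
    try:
        # Stand-in for the real failure (the Delta merge error).
        raise RuntimeError("DELTA merge failed")
    except RuntimeError as e:
        flaky_handler(e)
except TypeError as t:
    surfaced = t
    # The real error survives as the implicit chained context.
    original = t.__context__

print(type(surfaced).__name__)  # TypeError
print(type(original).__name__)  # RuntimeError
```

So the uninformative TypeError is the handler failing, not the query itself; the real error is still attached underneath, which is why it shows up when the error message is expanded.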

 

2 REPLIES

Yeshwanth
Databricks Employee

@Karlo_Kotarac Where do you see this error: 

(screenshot of the error in question)

 

In the notebook where this error happened, below the merge statement, after expanding the error message:

(screenshot: expanded error message below the merge statement)

The complete error is then the following:

TypeError: AutoFormattedTB.structured_traceback() missing 1 required positional argument: 'evalue'
    [... skipping hidden 1 frame]
File <command-77793784558808>, line 1
----> 1 merge_sql_v2(
      2   target_table = "silver.test",
      3   update_table = "updates_id",
      4   keycolumns = "id",
      5   history = False,
      6   dryrun = False
      7 )
File <command-77793784557616>, line 134, in merge_sql_v2(target_table, update_table, keycolumns, history, dryrun)
    133 if dryrun == False:
--> 134   spark.sql(mergesql).display()
    135   spark.sql(deletesql).display()
File /databricks/spark/python/pyspark/instrumentation_utils.py:47, in _wrap_function.<locals>.wrapper(*args, **kwargs)
     46 try:
---> 47     res = func(*args, **kwargs)
     48     logger.log_success(
     49         module_name, class_name, function_name, time.perf_counter() - start, signature
     50     )
File /databricks/spark/python/pyspark/sql/session.py:1748, in SparkSession.sql(self, sqlQuery, args, **kwargs)
   1745         litArgs = self._jvm.PythonUtils.toArray(
   1746             [_to_java_column(lit(v)) for v in (args or [])]
   1747         )
-> 1748     return DataFrame(self._jsparkSession.sql(sqlQuery, litArgs), self)
   1749 finally:
File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1355, in JavaMember.__call__(self, *args)
   1354 answer = self.gateway_client.send_command(command)
-> 1355 return_value = get_return_value(
   1356     answer, self.gateway_client, self.target_id, self.name)
   1358 for temp_arg in temp_args:
File /databricks/spark/python/pyspark/errors/exceptions/captured.py:230, in capture_sql_exception.<locals>.deco(*a, **kw)
    227 if not isinstance(converted, UnknownException):
    228     # Hide where the exception came from that shows a non-Pythonic
    229     # JVM exception message.
--> 230     raise converted from None
    231 else:
UnsupportedOperationException: [DELTA_MULTIPLE_SOURCE_ROW_MATCHING_TARGET_ROW_IN_MERGE] Cannot perform Merge as multiple source rows matched and attempted to modify the same
target row in the Delta table in possibly conflicting ways. By SQL semantics of Merge,
when multiple source rows match on the same target row, the result may be ambiguous
as it is unclear which source row should be used to update or delete the matching
target row. You can preprocess the source table to eliminate the possibility of
multiple matches. Please refer to
https://docs.microsoft.com/azure/databricks/delta/merge#merge-error

During handling of the above exception, another exception occurred:
Py4JError                                 Traceback (most recent call last)
File /databricks/python/lib/python3.10/site-packages/IPython/core/interactiveshell.py:1975, in InteractiveShell.set_custom_exc.<locals>.wrapped(self, etype, value, tb, tb_offset)
   1974 try:
-> 1975     stb = handler(self,etype,value,tb,tb_offset=tb_offset)
   1976     return validate_stb(stb)
File /databricks/python_shell/dbruntime/ExceptionHandler.py:26, in custom_exception_handler(shell, etype, exception, tb, tb_offset)
     21 data = {
     22     'errorClass': exception.getErrorClass(),
     23     'messageParameters': exception.getMessageParameters(),
     24     'sqlState': exception.getSqlState(),
     25 }
---> 26 query_contexts = exception.getQueryContext()
     27 if len(query_contexts) != 0:
File /databricks/spark/python/pyspark/errors/exceptions/captured.py:150, in CapturedException.getQueryContext(self)
    147 if self._origin is not None and is_instance_of(
    148     gw, self._origin, "org.apache.spark.SparkThrowable"
    149 ):
--> 150     return [QueryContext(q) for q in self._origin.getQueryContext()]
    151 else:
File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1355, in JavaMember.__call__(self, *args)
   1354 answer = self.gateway_client.send_command(command)
-> 1355 return_value = get_return_value(
   1356     answer, self.gateway_client, self.target_id, self.name)
   1358 for temp_arg in temp_args:
File /databricks/spark/python/pyspark/errors/exceptions/captured.py:224, in capture_sql_exception.<locals>.deco(*a, **kw)
    223 try:
--> 224     return f(*a, **kw)
    225 except Py4JJavaError as e:
File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py:330, in get_return_value(answer, gateway_client, target_id, name)
    329     else:
--> 330         raise Py4JError(
    331             "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
    332             format(target_id, ".", name, value))
    333 else:
Py4JError: An error occurred while calling o514.getQueryContext. Trace:
py4j.Py4JException: Method getQueryContext([]) does not exist
	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:344)
	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:352)
	at py4j.Gateway.invoke(Gateway.java:297)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:119)
	at java.lang.Thread.run(Thread.java:750)


During handling of the above exception, another exception occurred:
TypeError                                 Traceback (most recent call last)
    [... skipping hidden 1 frame]
File /databricks/python/lib/python3.10/site-packages/IPython/core/interactiveshell.py:1985, in InteractiveShell.set_custom_exc.<locals>.wrapped(self, etype, value, tb, tb_offset)
   1983     print(self.InteractiveTB.stb2text(stb))
   1984     print("The original exception:")
-> 1985     stb = self.InteractiveTB.structured_traceback(
   1986                             (etype,value,tb), tb_offset=tb_offset
   1987     )
   1988 return stb

Hope this helps.
