I am trying to run the following code in a single cell of a Databricks notebook (Databricks Runtime 14.3 LTS, Apache Spark 3.5.0, Scala 2.12):
spark.sql("CREATE OR REPLACE table sample_catalog.sample_schema.sample_table_tmp AS SELECT * FROM sample_catalog.sample_schema.sample_table")
df = spark.sql("SELECT * FROM sample_catalog.sample_schema.sample_table_tmp")
if df is not None:
    if not df.isEmpty():
        display(df)
del(df)
spark.sql("DROP TABLE IF EXISTS sample_catalog.sample_schema.sample_table_tmp")
After display() runs successfully, the code above fails on the last line, when it tries to drop the table, with the following message:
2024-02-22 17:12:10,508 34507 ERROR _handle_rpc_error GRPC Error received
Traceback (most recent call last):
File "/databricks/spark/python/pyspark/sql/connect/client/core.py", line 1312, in _analyze
resp = self._stub.AnalyzePlan(req, metadata=self._builder.metadata())
File "/databricks/python/lib/python3.10/site-packages/grpc/_channel.py", line 946, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "/databricks/python/lib/python3.10/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.INTERNAL
details = "[TABLE_OR_VIEW_NOT_FOUND] The table or view `sample_catalog`.`sample_schema`.`sample_table_tmp` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [sample_catalog, sample_schema, sample_table_tmp], [], false
"
debug_error_string = "UNKNOWN:Error received from peer unix:/databricks/sparkconnect/grpc.sock {created_time:"2024-02-22T17:12:10.508374255+00:00", grpc_status:13, grpc_message:"[TABLE_OR_VIEW_NOT_FOUND] The table or view `sample_catalog`.`sample_schema`.`sample_table_tmp` cannot be found. Verify the spelling and correctness of the schema and catalog.\nIf you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.\nTo tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01; line 1 pos 14;\n\'Project [*]\n+- \'UnresolvedRelation [sample_catalog, sample_schema, sample_table_tmp], [], false\n"}"
It is not clear to me that the error message accurately explains the problem here:
1) the table in question does not appear to be missing, since the display() call worked fine, so the table was clearly visible,
2) the last command uses "DROP TABLE IF EXISTS", not "DROP TABLE", and
3) the exact same code works fine when I move the last line (i.e. the spark.sql("DROP TABLE ...") call) into a subsequent cell and run the two cells consecutively.
I'm wondering whether something is going on behind the scenes, perhaps with how the data is distributed, that complicates things. Could anyone please explain what is causing this error?
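One experiment that might narrow this down: the plan printed in the error is a 'Project [*] over an UnresolvedRelation, which looks like the SELECT rather than the DROP statement, so my guess (an assumption, not something I have confirmed) is that the DataFrame is analyzed again after the table has already been dropped. The sketch below collects the rows into plain Python objects before dropping the table, so nothing lazy should still refer to it. The function name snapshot_then_drop and the max_rows parameter are made up for illustration; only spark.sql(), DataFrame.limit() and DataFrame.collect() are real PySpark calls.

```python
def snapshot_then_drop(spark, source_table, tmp_table, max_rows=1000):
    """Copy source_table to tmp_table, collect up to max_rows rows, then drop tmp_table.

    Hypothetical helper for a same-cell experiment. It assumes the error comes
    from a lazy DataFrame being re-analyzed after the DROP, which is unconfirmed.
    """
    spark.sql(f"CREATE OR REPLACE TABLE {tmp_table} AS SELECT * FROM {source_table}")
    try:
        # collect() forces execution now and returns a plain Python list of rows,
        # so no lazy plan pointing at tmp_table is needed afterwards.
        rows = spark.sql(f"SELECT * FROM {tmp_table}").limit(max_rows).collect()
    finally:
        # Drop the temporary table even if the query above fails.
        spark.sql(f"DROP TABLE IF EXISTS {tmp_table}")
    return rows
```

In the notebook I would call it as snapshot_then_drop(spark, "sample_catalog.sample_schema.sample_table", "sample_catalog.sample_schema.sample_table_tmp") and print the first few rows. If this version runs cleanly in a single cell, that would point towards lazy evaluation of df rather than the DROP statement itself.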