Issue when testing SparkSession.sql() with pytest

Rafal9

Dear Community,

I am testing PySpark code with pytest, using VS Code and Databricks Connect.

The SparkSession is created from Databricks Connect:

from databricks.connect import DatabricksSession
spark = DatabricksSession.builder.getOrCreate()

I receive an error message every time I call the 'SparkSession.sql()' method.

For example:

# module.py
def create_catalog(spark_session):
    """Doc string"""
    spark_session.sql("""CREATE CATALOG IF NOT EXISTS test_catalog""")

# test_module.py
import pytest

from module import create_catalog

@pytest.fixture(scope="session")
def spark_session():
    """Creates SparkSession."""

    global spark
    try:
        spark
    except NameError:
        from databricks.connect import DatabricksSession
        spark = DatabricksSession.builder.getOrCreate()
    yield spark

def test_create_catalog(spark_session):
    """Doc string"""
    create_catalog(spark_session)

I am receiving the following error message:

pyspark.errors.exceptions.connect.SparkConnectGrpcException: <_MultiThreadedRendezvous of RPC that terminated with:
E               status = StatusCode.UNIMPLEMENTED
E               details = "Method not found: spark.connect.SparkConnectService/ReattachExecute"
E               debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2023-11-04T16:14:26.2187837+00:00", grpc_status:12, grpc_message:"Method not found: spark.connect.SparkConnectService/ReattachExecute"}"

The issue also occurs when I use the SparkSession directly rather than as a fixture.
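
In gRPC, StatusCode.UNIMPLEMENTED means the server does not implement the method the client called, here ReattachExecute. One possible cause, which I have not confirmed, is a mismatch between the local client packages and the cluster runtime. The sketch below only prints the locally installed versions so they can be compared with the cluster. The helper name `installed_version` is hypothetical, and the list of package names is an assumption about what is relevant:

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Client-side packages that may matter for Databricks Connect (assumed list).
for name in ("databricks-connect", "pyspark", "grpcio"):
    print(f"{name}: {installed_version(name)}")
```

A value of None means the package is not installed as a separate distribution in the active environment.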

I have confirmed that SparkSession.sql(), with the session created from databricks.connect, works correctly when I run the code via 'Run file as a Workflow on Databricks' from VS Code.
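
To check that the failure comes from the connection and not from the test logic, the function can also be exercised against a stand-in object that only records the SQL it receives. This is a minimal sketch: `RecordingSession` is a hypothetical class, not part of PySpark or Databricks Connect, and `create_catalog` is repeated from module.py so the block is self-contained:

```python
class RecordingSession:
    """Hypothetical stand-in for a SparkSession that records SQL statements."""

    def __init__(self):
        self.statements = []

    def sql(self, query):
        # Store the statement instead of sending it to a cluster.
        self.statements.append(query)


def create_catalog(spark_session):
    """Same function as in module.py."""
    spark_session.sql("""CREATE CATALOG IF NOT EXISTS test_catalog""")


fake_session = RecordingSession()
create_catalog(fake_session)
print(fake_session.statements)
```

If this passes under pytest while the real session still fails, the problem is likely in the connection rather than in the test code.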

Thank you in advance for any help,

Rafal