Different behavior on personal cluster vs job cluster

FRB1984 — Wed, 20 Aug 2025 16:58:58 GMT

Hi guys!
I am facing a weird bug here!
I have a notebook that runs perfectly on my personal cluster. As an example, I've printed the data output during the extraction:

Code:

cursor.execute(sql)
results = cursor.fetchall()
cols = [desc[0] for desc in cursor.description]
dfspark = spark.createDataFrame(results, cols)

Output in personal cluster:

As you can see, when running on the job cluster, the data is not converted to a Spark DataFrame (it is held as a pyspark.sql.connect.dataframe.DataFrame instead).

Re: Different behavior on personal cluster vs job cluster

Vidhi_Khaitan — Thu, 21 Aug 2025 04:55:13 GMT

Hi team,

In interactive notebooks on personal clusters, you're attached directly to the Spark driver inside the cluster, and the Spark session is the classic (legacy) PySpark session.
On job clusters, especially with newer runtimes (e.g. DBR 14.x+ or SQL warehouses), Databricks may automatically use Spark Connect. In that case, your client holds a pyspark.sql.connect DataFrame object, and operations are lazily pushed to the remote Spark cluster.
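
To confirm which kind of DataFrame a given run produced, a minimal sketch like the one below may help. It only inspects the module path of the object's class, so it needs no PySpark import. The helper names is_spark_connect and describe_dataframe are hypothetical and not part of PySpark or Databricks.

```python
def is_spark_connect(obj) -> bool:
    """Return True if obj's class, or any of its base classes,
    is defined under the pyspark.sql.connect package.

    Hypothetical helper for illustration; not a PySpark API.
    """
    return any(
        cls.__module__.startswith("pyspark.sql.connect")
        for cls in type(obj).__mro__
    )


def describe_dataframe(df) -> str:
    """Build a short label such as
    'Spark Connect DataFrame (pyspark.sql.connect.dataframe.DataFrame)'.
    """
    kind = "Spark Connect" if is_spark_connect(df) else "classic"
    cls = type(df)
    return f"{kind} DataFrame ({cls.__module__}.{cls.__name__})"
```

In the notebook from the question, print(describe_dataframe(dfspark)) should report "Spark Connect" on the job cluster and "classic" on the personal cluster. Either way the object is still a Spark DataFrame handle; the difference is where the work runs, so code that sticks to the DataFrame API will generally behave the same on both.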
