Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Different behavior on personal cluster vs job cluster

FRB1984
New Contributor II

Hi guys!
I am facing a weird bug here!
I have a notebook that runs perfectly on a personal cluster. As an example, I've printed the type of the data output during the extraction:

code:

cursor.execute(sql)
results = cursor.fetchall()
# column names come from the cursor description
cols = [desc[0] for desc in cursor.description]
dfspark = spark.createDataFrame(results, cols)
print(type(dfspark))

Output in personal cluster:

<class 'pyspark.sql.connect.dataframe.DataFrame'>


As you can see, when running in the job cluster, the data is not being converted to a classic Spark DataFrame (it is being held as pyspark.sql.connect.dataframe.DataFrame).
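For what it's worth, a quick way to tell which flavor of DataFrame you ended up with is to look at the class's module path. This is a hypothetical diagnostic helper (`is_connect_dataframe` is a made-up name, not a Databricks or PySpark API):

```python
def is_connect_dataframe(df) -> bool:
    """Return True when `df` is a Spark Connect DataFrame.

    Spark Connect DataFrames live under the pyspark.sql.connect package,
    while classic DataFrames live under pyspark.sql.dataframe.
    (Hypothetical helper for diagnostics only.)
    """
    return type(df).__module__.startswith("pyspark.sql.connect")

# In the notebook you would call, e.g.: is_connect_dataframe(dfspark)
```

Running this on both clusters would confirm whether the job cluster is handing you a Spark Connect proxy object.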


Vidhi_Khaitan
Databricks Employee

Hi team,

In interactive notebooks on personal clusters, you're attached directly to the Spark driver inside the cluster, and the Spark session is the classic PySpark session.
On job clusters, especially with newer runtimes (e.g. DBR 14.x+) or SQL warehouses, Databricks may automatically use Spark Connect. In that case your client (pyspark.sql.connect) holds the DataFrame object, and operations are lazily pushed to the remote Spark cluster.
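One practical consequence: code that type-checks against the classic DataFrame class can behave differently under Spark Connect, even though the DataFrame API itself is the same on both. A minimal sketch of writing against the shared API instead (`row_count` is an illustrative helper, not a Databricks API):

```python
def row_count(df) -> int:
    # Portable across classic and Spark Connect sessions: both DataFrame
    # classes expose the same methods (count, select, filter, ...), so
    # relying on the shared API rather than isinstance checks against
    # pyspark.sql.DataFrame keeps the notebook working on both personal
    # and job clusters.
    return df.count()
```

In other words, as long as the notebook only uses DataFrame operations, the pyspark.sql.connect.dataframe.DataFrame you see on the job cluster should behave the same; it is only class-identity checks or driver-local tricks that need adjusting.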