Hi guys!
I am facing a weird bug here!
I have a notebook that runs perfectly on a personal cluster. As an example, I've printed the data output during the extraction:
Code:
cursor.execute(sql)                               # run the extraction query through the DB-API cursor
results = cursor.fetchall()                       # rows come back as a list of tuples
cols = [desc[0] for desc in cursor.description]   # column names from the cursor metadata
dfspark = spark.createDataFrame(results, cols)    # build a Spark DataFrame from the rows
print(type(dfspark))
Output on the personal cluster:
<class 'pyspark.sql.connect.dataframe.DataFrame'>
When running on the job cluster, though, the data is not being converted to a plain Spark DataFrame; it is held as a pyspark.sql.connect.dataframe.DataFrame instead.
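For context, here is a minimal, self-contained version of the type check I'm running (the sample rows and column names are made up, and in a Databricks notebook spark is already defined). As far as I understand, type(df) reports pyspark.sql.dataframe.DataFrame on a classic cluster and pyspark.sql.connect.dataframe.DataFrame when the session goes through Spark Connect:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already provided in a Databricks notebook

# made-up sample data standing in for cursor.fetchall() / cursor.description
results = [(1, "alice"), (2, "bob")]
cols = ["id", "name"]

df = spark.createDataFrame(results, cols)
print(type(df))  # pyspark.sql.dataframe.DataFrame on a classic cluster,
                 # pyspark.sql.connect.dataframe.DataFrame under Spark Connect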