05-12-2024 10:40 PM
The following code throws an error when I run it locally in my IDE with Databricks Connect.
from databricks.connect import DatabricksSession
spark = DatabricksSession.builder.getOrCreate()
spark.sql("CREATE DATABASE IF NOT EXISTS sample")
spark.sql("DROP TABLE IF EXISTS sample.mvp")
spark.sql("DROP TABLE IF EXISTS sample.mvp_from_foreach_batch")
data = [("John", "Doe", 30), ("Jane", "Doe", 25), ("Mike", "Johnson", 35)]
df = spark.createDataFrame(data, ["FirstName", "LastName", "Age"])
df.write.format("delta").mode("overwrite").saveAsTable("sample.mvp")
def foreach_batch_function(df, epoch_id):
    df.write.format("delta").mode("overwrite").saveAsTable(
        "sample.mvp_from_foreach_batch"
    )
spark.readStream.table("sample.mvp").writeStream.foreachBatch(
    foreach_batch_function
).outputMode("append").trigger(availableNow=True).start().awaitTermination()
This code only works in notebooks or directly on a cluster. It will not run locally in an IDE with Databricks Connect.
Instead, the following error is raised:
pyspark.errors.exceptions.connect.SparkException: No PYTHON_UID found for session (some uid)
In general, Databricks Connect works fine for all other cases.
My local environment:
Cluster running on
05-21-2024 05:50 AM
Hi @TWib,
databricks-connect configure
command and cr... Your local environment details look fine, but double-check the points above to resolve the error. If you need further assistance, feel free to ask!
05-21-2024 06:09 AM
The only thing that differs is the Python version: 3.11.0 on the cluster vs. 3.11.4 locally. This shouldn't be an issue.
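For anyone who wants to compare their setup in the same way, here is a minimal sketch that prints the local Python version and the installed `databricks-connect` package version. The function name `local_environment_summary` is made up for illustration. It only inspects the local interpreter and does not contact the cluster.

```python
import platform
from importlib import metadata


def local_environment_summary() -> dict:
    """Collect local versions worth comparing against the cluster runtime."""
    summary = {"python": platform.python_version()}
    try:
        # Distribution name as published on PyPI.
        summary["databricks-connect"] = metadata.version("databricks-connect")
    except metadata.PackageNotFoundError:
        # The package is not installed in this interpreter.
        summary["databricks-connect"] = None
    return summary


print(local_environment_summary())
```

The cluster-side values still have to be read from the cluster configuration or a notebook and compared by hand.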
Does this code run for you?
05-21-2024 06:38 AM - edited 05-21-2024 06:39 AM
05-21-2024 10:21 PM
@Kaniz_Fatma Notebooks in the Databricks workspace also work for me (this was never the problem).
Locally, in VS Code with Databricks Connect, it fails.
05-27-2024 10:28 PM
One more finding: it seems to occur only on single-user clusters.
05-30-2024 10:27 PM
This is still unresolved. Internally we have dropped streaming for now because of the many problems we ran into, and another ticket with support is open.
For now, I do not recommend using streaming with foreachBatch if you want to use Databricks Connect.
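If the per-batch logic is only a plain write, as in the example at the top of this thread, one thing that might be worth trying is to drop `foreachBatch` and let the stream write to the target table directly with `DataStreamWriter.toTable`. This is only a sketch under assumptions: I have not confirmed that it avoids the `No PYTHON_UID` error, the helper name and the checkpoint path are placeholders, and it appends to the target table instead of overwriting it on every batch, so it is not a drop-in replacement.

```python
def copy_stream_without_foreach_batch(spark, source_table, target_table, checkpoint_location):
    """Stream rows from source_table into target_table without foreachBatch.

    `spark` is expected to be an existing session, e.g. the one returned by
    DatabricksSession.builder.getOrCreate(). The function name is hypothetical.
    """
    query = (
        spark.readStream.table(source_table)
        .writeStream.format("delta")
        .outputMode("append")
        # Assumption: a checkpoint path that the cluster can write to.
        .option("checkpointLocation", checkpoint_location)
        .trigger(availableNow=True)
        .toTable(target_table)  # starts the query and returns a StreamingQuery
    )
    query.awaitTermination()
    return query


# Example call (not executed here, requires a configured Databricks Connect session):
# from databricks.connect import DatabricksSession
# spark = DatabricksSession.builder.getOrCreate()
# copy_stream_without_foreach_batch(
#     spark, "sample.mvp", "sample.mvp_from_foreach_batch", "<checkpoint path>"
# )
```

Because no Python function is shipped to the cluster for each batch, this path does not go through `foreachBatch` at all. Whether that is enough to sidestep the problem on a single-user cluster is something you would have to test.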