Getting python version errors when using pyspark rdd using databricks connect
04-18-2024 04:21 AM - edited 04-18-2024 04:22 AM
Hi community,
When I use PySpark RDD-related functions via Databricks Connect in my environment, I get the error below:
Databricks cluster version: 12.2.
`RuntimeError: Python in worker has different version 3.9 than that in driver 3.10, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON...`
How can I resolve it?
- Labels:
  - Databricks connect
  - Databricks Pyspark
05-01-2024 10:55 PM
Got it. As a side note, I tried the above methods, but the error persisted. On reading the docs again, I found this statement: "You must install Python 3 on your development machine, and the minor version of your client Python installation must be the same as the minor Python version of your Databricks cluster." (Link: https://docs.databricks.com/en/dev-tools/databricks-connect-legacy.html#requirements).
After aligning my environment's Python version with the Databricks cluster's Python version, the error was resolved.
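A quick local check can catch the mismatch before connecting. This is a minimal sketch, not an official API: the `check_python_match` helper is illustrative, and the 3.9 target is taken from the worker version reported in the error message above.

```python
import sys

def check_python_match(local, cluster):
    """True when the driver and worker interpreters share the same major.minor version."""
    return tuple(local[:2]) == tuple(cluster[:2])

# The worker side reported Python 3.9 in the error above; compare the
# local (driver/client) interpreter against it before starting a session.
cluster_python = (3, 9)
if not check_python_match(sys.version_info, cluster_python):
    print(
        f"Mismatch: local Python {sys.version_info.major}.{sys.version_info.minor} "
        f"vs cluster Python {cluster_python[0]}.{cluster_python[1]}; "
        "recreate your environment with a matching interpreter."
    )
```

Only the minor version needs to match, which is why the helper compares just the first two components of the version tuple.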

