This ImportError happens because you have both standalone pyspark and databricks-connect installed, and they conflict with each other. databricks-connect bundles its own version of PySpark internally — when the standalone pyspark package is also present, Python imports from the wrong one, which doesn't have PythonUDFEnvironment.
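To confirm which installation Python is actually resolving, you can print the on-disk location of the `pyspark` module; a quick diagnostic (the exact path will vary per environment):

```shell
# Print where the pyspark module Python imports actually lives.
# Comparing this path against `poetry show pyspark` / `poetry show
# databricks-connect` reveals which package is winning the conflict.
python -c "import pyspark; print(pyspark.__file__)" 2>/dev/null \
  || echo "pyspark not importable in this environment"
```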
Fix: remove the standalone pyspark and use only databricks-connect:

```makefile
# Remove standalone pyspark first
poetry-remove-pyspark:
	poetry remove pyspark

# Install databricks-connect (which bundles a compatible PySpark)
poetry-add-databricks-connect:
	poetry add databricks-connect@~17.3

# Verify no standalone pyspark is installed
check-deps:
	poetry show pyspark 2>&1 || echo "OK: no standalone pyspark"
	poetry show databricks-connect
```
Key rules to avoid this:
- Never install `pyspark` alongside `databricks-connect` — they conflict
- databricks-connect version must match your cluster DBR version (e.g., ~17.3 for DBR 17.3)
- After removing pyspark, clear any cached bytecode: `find . -name "*.pyc" -delete`
If you need standalone PySpark for local-only testing (no Databricks), keep them in separate Poetry dependency groups and never activate both simultaneously.
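One way to sketch that separation in `pyproject.toml`, using two optional Poetry dependency groups (the group names `databricks` and `local-spark` and the version constraints here are illustrative, not prescriptive):

```toml
# Two mutually exclusive optional groups -- install one or the other,
# never both at once.
[tool.poetry.group.databricks]
optional = true

[tool.poetry.group.databricks.dependencies]
databricks-connect = "~17.3"

[tool.poetry.group.local-spark]
optional = true

[tool.poetry.group.local-spark.dependencies]
pyspark = "^4.0"
```

Then `poetry install --with databricks` for Databricks work and `poetry install --with local-spark` for local-only testing, taking care never to pass both groups in the same environment.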
Anuj Lathi
Solutions Engineer @ Databricks