setup justfile command in order to launch your spark application
2 weeks ago
Hello Guys,
I built a justfile for my project to launch my wheel job task from the command line, but when I run the wheel task I get this error:
from pyspark.sql.connect.expressions import PythonUDFEnvironment
ImportError: cannot import name 'PythonUDFEnvironment' from 'pyspark.sql.connect.expressions'
Does anyone know how to solve this issue?
justfile

default:
    @just --list

install:
    poetry install

poetry-remove-pyspark:
    poetry show pyspark  # Is PySpark already installed?

poetry-uninstall-pyspark:
    poetry remove pyspark

test-connexion-databricks:
    databricks-connect test

poetry-install-pyspark:
    poetry add "pyspark (>=3.5.5,<=4.1.1)"

poetry-add-pyspark-connect:
    poetry add databricks-connect@~17.3  # Or X.Y to match your cluster version.

edpiqual-example:
    python -c "from edpiqual.entrypoint import main; main()" \
        --ingestion_catalog_name default \
        --product_catalog_name workspace \
        --data_bundle Transactions --data_bundle_object test1 \
        --format csv --source_type autoloader --trigger availableNow \
        --agreement_version 1 --start_date 2026-03-08T04:01:40.285Z \
        --write_mode upsert --upsert_column reference_id
This is my justfile.
a week ago
This error typically happens when there’s a version mismatch between your local pyspark installation and databricks-connect.
PythonUDFEnvironment was introduced in a specific version of the Databricks Connect SDK. If you have a standalone pyspark package installed alongside databricks-connect, it shadows the pyspark modules bundled with the connector, so the import resolves against the wrong (older) package.
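You can confirm the conflict before touching your dependencies. A minimal sketch (the helper name `installed` is ours, not part of any SDK) that lists which of the two distributions are present in the current environment — if both appear, the standalone pyspark is the likely culprit:

```python
from importlib import metadata


def installed(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# If both report a version, the standalone pyspark wheel is shadowing
# the pyspark modules bundled inside databricks-connect.
for name in ("pyspark", "databricks-connect"):
    print(name, "->", installed(name) or "not installed")
```

Run it inside the same Poetry virtualenv that executes the wheel task (`poetry run python check.py`), since that is the environment where the shadowing happens.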
If this is the issue: remove the standalone pyspark package and keep only databricks-connect. Then verify via Poetry that pyspark is no longer installed (`poetry show pyspark` should report it as not found).
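Since the project already drives everything through just, the fix fits naturally as one more recipe. A sketch, assuming the justfile from the question above; the `~17.3` pin is only an example and should match your cluster's Databricks Runtime (the leading `-` tells just to ignore the expected failure of the final check):

```
fix-pyspark-shadowing:
    poetry remove pyspark
    poetry add databricks-connect@~17.3  # Match your cluster runtime version.
    -poetry show pyspark  # Should fail: pyspark is no longer installed directly.
```

After this, rerun the wheel task; the import should resolve against the pyspark bundled with databricks-connect.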