Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Set up a justfile command to launch your Spark application

seefoods
Valued Contributor

Hello Guys,

I built a justfile for my project to execute my wheel job task from the command line, but when I run the wheel task I hit this error:

from pyspark.sql.connect.expressions import PythonUDFEnvironment
ImportError: cannot import name 'PythonUDFEnvironment' from 'pyspark.sql.connect.expressions'

Does anyone know how to solve this issue?

justfile

default:
    @just --list

install:
    poetry install

poetry-remove-pyspark:
    poetry show pyspark  # Is PySpark already installed?

poetry-uninstall-pyspark:
    poetry remove pyspark

test-connexion-databricks:
    databricks-connect test

poetry-install-pyspark:
    poetry add "pyspark (>=3.5.5,<=4.1.1)"

poetry-add-pyspark-connect:
    poetry add databricks-connect@~17.3  # Or X.Y to match your cluster version.

edpiqual-example:
    python -c "from edpiqual.entrypoint import main; main()" \
        --ingestion_catalog_name default \
        --product_catalog_name workspace \
        --data_bundle Transactions --data_bundle_object test1 \
        --format csv --source_type autoloader --trigger availableNow \
        --agreement_version 1 --start_date 2026-03-08T04:01:40.285Z \
        --write_mode upsert --upsert_column reference_id

This is my justfile.

1 REPLY

mderela
Contributor

This error typically happens when there's a version mismatch between your local pyspark installation and databricks-connect. PythonUDFEnvironment was introduced in a specific version of the Databricks Connect SDK; if you have a standalone pyspark package installed alongside databricks-connect, it shadows the correct one bundled with the connector.

If that is the issue: remove the standalone pyspark package and keep only databricks-connect. Then verify with Poetry that pyspark is no longer installed (poetry show).