Difference between running PySpark code with the python3 command and with the pyspark command

twotwoiscute — Sat, 17 Jul 2021 03:50:29 GMT

I am confused about the difference between running the code with the command

python3 CODENAME.py

and launching the shell with the command

pyspark

and then entering the code there.

When I run this code:

spark = SparkSession.builder.config("spark.driver.memory", "16").appName("EDA").getOrCreate()

The first way

python3 CODENAME.py

raises an error, even though I have already run

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export SPARK_HOME=/home/twotwo/anaconda3/envs/yolov5/lib/python3.8/site-packages/pyspark
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.9-src.zip:$PYTHONPATH
export PATH=$SPARK_HOME/python:$PATH

The error message looks like this:

Exception: Java gateway process exited before sending its port number

However, the second way runs the code without any problem. I would like to know what the difference between these two ways is. Thanks.
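
One line of investigation, offered as a sketch and not as a confirmed diagnosis: the pyspark shell already creates a session named spark when it starts, so getOrCreate() there returns that existing session, whereas python3 CODENAME.py has to launch the JVM itself using the settings given to the builder. Under that assumption, two things are worth checking before the session is built: whether the environment variables are visible to the Python process, and whether the value passed for spark.driver.memory ("16" above) carries a unit suffix, since Spark memory settings are normally written like 16g. The helper names has_size_unit and environment_report are hypothetical and are not part of PySpark.

```python
import os
import re

# Hypothetical helpers for illustration only; they are not part of PySpark.
# Matches values such as "16g", "512m", or "2GB" (digits followed by a unit).
_SIZE_WITH_UNIT = re.compile(r"^\d+\s*[kmgt]b?$", re.IGNORECASE)


def has_size_unit(value):
    """Return True if a memory setting such as "16g" ends in a size unit."""
    return bool(_SIZE_WITH_UNIT.match(value.strip()))


def environment_report(env):
    """Return the Spark-related variables as seen in the given mapping."""
    names = ("JAVA_HOME", "SPARK_HOME", "PYTHONPATH")
    return {name: env.get(name) for name in names}


if __name__ == "__main__":
    # Show what the Python process actually sees, which can differ from
    # the shell where the export commands were typed.
    for name, value in environment_report(os.environ).items():
        print(f"{name} = {value}")
    print("'16' has a unit:", has_size_unit("16"))
    print("'16g' has a unit:", has_size_unit("16g"))
```

If the report shows the variables as expected and only the memory value lacks a unit, trying the builder with a value such as "16g" would be a reasonable next experiment.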
