02-03-2023 01:04 PM
I have Jupyter Notebook installed on my machine and working normally. I tested running a Spark application with the spark-submit command, but it returns a message saying the file was not found. What do I need to do to make it work?
Below is a file with a simple example:
from pyspark.sql import SparkSession
from pyspark.sql.functions import year

if __name__ == "__main__":
    spark = SparkSession.builder.appName("Exemplo").getOrCreate()

    # Explicit schema for the CSV file
    arqschema = "id INT, nome STRING, status STRING, cidade STRING, vendas INT, data STRING"

    # Raw string so "\t" in the Windows path is not read as a tab
    despachantes = spark.read.csv(r"C:\test-spark\despachantes.csv", header=False, schema=arqschema)

    # The schema names the column "data", not "date"
    calculo = despachantes.select("data").groupBy(year("data")).count()
    calculo.show()

    spark.stop()
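For reference, a "file not found" error from spark-submit usually means the script path passed on the command line is wrong or unquoted. A minimal sketch of the invocation, assuming Spark's bin directory is on PATH and the file above was saved as exemplo.py (a hypothetical name) in C:\test-spark:

```shell
REM Hypothetical path and filename -- adjust to where you saved the script.
REM Quote the path so spaces and backslashes survive the shell:
spark-submit "C:\test-spark\exemplo.py"
```

If the command itself is not recognized, check that SPARK_HOME is set and %SPARK_HOME%\bin is on PATH.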
02-06-2023 05:38 PM
I managed to resolve it. It was a Java and Python version incompatibility with the Spark version I was using.
I will create a video explaining how to use Spark without Jupyter Notebook.
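Since the root cause was a version mismatch, a quick sanity check of the local Java and Python versions before running spark-submit can save time. A minimal sketch, assuming a Spark 3.x-style requirement (e.g. Java 8/11/17, Python 3.7+; check the docs for your exact release) and that `java` is on PATH; the helper names are my own, not part of any Spark API:

```python
import re
import subprocess
import sys

def parse_java_major(version_line):
    """Extract the Java major version from `java -version` output.
    Handles both the old "1.8.0_292" and the new "11.0.2" / "17" schemes."""
    m = re.search(r'version "(\d+)(?:\.(\d+))?', version_line)
    if not m:
        return None
    major = int(m.group(1))
    minor = int(m.group(2)) if m.group(2) else 0
    return minor if major == 1 else major  # "1.8" -> 8, "11.0" -> 11

def java_major_version():
    # `java -version` prints to stderr, not stdout
    out = subprocess.run(["java", "-version"],
                         capture_output=True, text=True).stderr
    return parse_java_major(out)

if __name__ == "__main__":
    print("Python:", ".".join(map(str, sys.version_info[:2])))
    try:
        print("Java major:", java_major_version())
    except FileNotFoundError:
        print("java not found on PATH")
```

Comparing these two numbers against the compatibility table for your Spark release is usually enough to spot the mismatch described above.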
02-05-2023 10:20 AM
Hi, I have not tested this in my lab yet, but could you please check and confirm whether this works: https://stackoverflow.com/questions/37861469/how-to-submit-spark-application-on-cmd