02-03-2023 01:04 PM
I have Jupyter Notebook installed on my machine and it works normally. I tested running a Spark application with the spark-submit command, but it returns a message saying the file was not found. What do I need to do to make it work?
Below is a file with a simple example.
from pyspark.sql import SparkSession
from pyspark.sql.functions import year

if __name__ == "__main__":
    spark = SparkSession.builder.appName("Exemplo").getOrCreate()

    arqschema = "id INT, nome STRING, status STRING, cidade STRING, vendas INT, data STRING"
    # Use a raw string so "\t" in the Windows path is not read as a tab character
    despachantes = spark.read.csv(r"C:\test-spark\despachantes.csv", header=False, schema=arqschema)

    # The schema names this column "data", not "date"
    calculo = despachantes.select("data").groupBy(year("data")).count()
    # format("console") is only valid for streaming writes; use show() for batch output
    calculo.show()

    spark.stop()
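One easy-to-miss cause of "file not found" on Windows is the path string itself: in a plain Python string literal, "\t" is the TAB escape, so "C:\test-spark\..." silently becomes a different path. A minimal sketch of the difference (the paths here are just the ones from the example above):

```python
# In a regular string literal, "\t" is interpreted as a TAB character.
plain = "C:\test-spark\despachantes.csv"
# A raw string (r"...") keeps the backslashes literally.
raw = r"C:\test-spark\despachantes.csv"

print("\t" in plain)   # the "plain" path contains a tab, not "\t"
print(plain == raw)    # the two strings are not the same path
```

Forward slashes ("C:/test-spark/despachantes.csv") also avoid the problem and work fine with Spark on Windows.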
02-06-2023 05:38 PM
I managed to resolve it. It was a Java and Python version incompatibility with the Spark version I was using.
I will create a video explaining how to use Spark without Jupyter Notebook.
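For anyone else who lands here: once the Java/Python versions match your Spark build, a script like the one above can be run directly from the command line. A minimal sketch, assuming the example is saved as despachantes_job.py (the filename is my assumption):

```shell
# Run the job on the local machine using all cores, no Jupyter needed.
# "despachantes_job.py" is a hypothetical name for the script above.
spark-submit --master local[*] despachantes_job.py
```

Passing the script path relative to the directory you run the command from (or as an absolute path) also avoids the "file not found" error when the path is simply wrong.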
02-05-2023 10:20 AM
Hi, I have not tested this in my lab yet, but could you please check and confirm whether this works: https://stackoverflow.com/questions/37861469/how-to-submit-spark-application-on-cmd