02-03-2023 01:04 PM
I have Jupyter Notebook installed on my machine and working normally. I tested running a Spark application with the spark-submit command, but it returns a message saying the file was not found. What do I need to do to make it work?
Below is a file with a simple example:
from pyspark.sql import SparkSession
from pyspark.sql.functions import year

if __name__ == "__main__":
    spark = SparkSession.builder.appName("Exemplo").getOrCreate()

    # Explicit schema for the CSV file
    arqschema = "id INT, nome STRING, status STRING, cidade STRING, vendas INT, data STRING"

    # Raw string so "\t" in the Windows path is not read as a tab
    despachantes = spark.read.csv(r"C:\test-spark\despachantes.csv", header=False, schema=arqschema)

    # The schema names the column "data", not "date"
    calculo = despachantes.select("data").groupBy(year("data")).count()
    calculo.show()

    spark.stop()
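For reference, a "file not found" error from spark-submit usually means the script path passed on the command line is wrong or unquoted. A minimal sketch of the invocation, assuming Spark's bin directory is on PATH and the file above was saved as exemplo.py (a hypothetical name) in C:\test-spark:

```shell
REM Hypothetical path and filename -- adjust to where you saved the script.
REM Quote the path so spaces and backslashes survive the shell:
spark-submit "C:\test-spark\exemplo.py"
```

If the command itself is not recognized, check that SPARK_HOME is set and %SPARK_HOME%\bin is on PATH.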
02-06-2023 05:38 PM
I managed to resolve it. It was a Java and Python version incompatibility with the Spark version I was using.
I will create a video explaining how to use Spark without Jupyter Notebook.
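Since the root cause was a version mismatch, a quick sanity check of the local Java and Python versions before running spark-submit can save time. A minimal sketch, assuming a Spark 3.x-style requirement (e.g. Java 8/11/17, Python 3.7+; check the docs for your exact release) and that `java` is on PATH; the helper names are my own, not part of any Spark API:

```python
import re
import subprocess
import sys

def parse_java_major(version_line):
    """Extract the Java major version from `java -version` output.
    Handles both the old "1.8.0_292" and the new "11.0.2" / "17" schemes."""
    m = re.search(r'version "(\d+)(?:\.(\d+))?', version_line)
    if not m:
        return None
    major = int(m.group(1))
    minor = int(m.group(2)) if m.group(2) else 0
    return minor if major == 1 else major  # "1.8" -> 8, "11.0" -> 11

def java_major_version():
    # `java -version` prints to stderr, not stdout
    out = subprocess.run(["java", "-version"],
                         capture_output=True, text=True).stderr
    return parse_java_major(out)

if __name__ == "__main__":
    print("Python:", ".".join(map(str, sys.version_info[:2])))
    try:
        print("Java major:", java_major_version())
    except FileNotFoundError:
        print("java not found on PATH")
```

Comparing these two numbers against the compatibility table for your Spark release is usually enough to spot the mismatch described above.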
02-05-2023 10:20 AM
Hi, I have not tested this in my lab yet, but could you please check and confirm whether this works: https://stackoverflow.com/questions/37861469/how-to-submit-spark-application-on-cmd