<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Spark Error when running python script on Databricks in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/spark-error-when-running-python-script-on-databricks/m-p/14406#M8900</link>
    <description>&lt;P&gt;I have the following basic script, which works fine in PyCharm on my machine.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;from pyspark.sql import SparkSession&lt;/P&gt;&lt;P&gt;print("START")&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;spark = SparkSession \&lt;/P&gt;&lt;P&gt;    .Builder() \&lt;/P&gt;&lt;P&gt;    .appName("myapp") \&lt;/P&gt;&lt;P&gt;    .master('local[*, 4]') \&lt;/P&gt;&lt;P&gt;    .getOrCreate()&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;print(spark)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data = [('James', '', 'Smith', '1991-04-01', 'M', 3000),&lt;/P&gt;&lt;P&gt;        ('Michael', 'Rose', '', '2000-05-19', 'M', 4000),&lt;/P&gt;&lt;P&gt;        ('Robert', '', 'Williams', '1978-09-05', 'M', 4000),&lt;/P&gt;&lt;P&gt;        ('Maria', 'Anne', 'Jones', '1967-12-01', 'F', 4000),&lt;/P&gt;&lt;P&gt;        ('Jen', 'Mary', 'Brown', '1980-02-17', 'F', -1)&lt;/P&gt;&lt;P&gt;        ]&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;columns = ["firstname", "middlename", "lastname", "dob", "gender", "salary"]&lt;/P&gt;&lt;P&gt;df = spark.createDataFrame(data=data, schema=columns)&lt;/P&gt;&lt;P&gt;print(df)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, when I try to run it on a Databricks cluster directly as a Python script, it fails with the error below.&lt;/P&gt;&lt;P&gt;START&lt;/P&gt;&lt;P&gt;Traceback (most recent call last):&lt;/P&gt;&lt;P&gt;File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main&lt;/P&gt;&lt;P&gt;return _run_code(code, main_globals, None,&lt;/P&gt;&lt;P&gt;File "/usr/lib/python3.8/runpy.py", line 87, in _run_code&lt;/P&gt;&lt;P&gt;exec(code, run_globals)&lt;/P&gt;&lt;P&gt;File "/Workspace/Repos/***********/sdk_test/tests/snippets/spark_tests.py", line 13, in&lt;/P&gt;&lt;P&gt;class SparkTests:&lt;/P&gt;&lt;P&gt;File "/Workspace/Repos/*******/sdk_test/tests/snippets/spark_tests.py", line 16, in SparkTests&lt;/P&gt;&lt;P&gt;sc = SparkContext.getOrCreate()&lt;/P&gt;&lt;P&gt;File "/databricks/spark/python/pyspark/context.py", line 400, in getOrCreate&lt;/P&gt;&lt;P&gt;SparkContext(conf=conf or SparkConf())&lt;/P&gt;&lt;P&gt;File "/databricks/spark/python/pyspark/context.py", line 147, in __init__&lt;/P&gt;&lt;P&gt;self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,&lt;/P&gt;&lt;P&gt;File "/databricks/spark/python/pyspark/context.py", line 192, in _do_init&lt;/P&gt;&lt;P&gt;raise RuntimeError("A master URL must be set in your configuration")&lt;/P&gt;&lt;P&gt;RuntimeError: A master URL must be set in your configuration&lt;/P&gt;&lt;P&gt;CalledProcessError: Command 'b'cd ../\n\n/databricks/python3/bin/python -m tests.snippets.spark_tests\n# python -m tests.runner --env=qa --runtime_env=databricks --upload=True --package=sdk\n'' returned non-zero exit status 1.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What am I missing?&lt;/P&gt;</description>
    <pubDate>Thu, 07 Jul 2022 11:15:40 GMT</pubDate>
    <dc:creator>170017</dc:creator>
    <dc:date>2022-07-07T11:15:40Z</dc:date>
    <item>
      <title>Spark Error when running python script on databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-error-when-running-python-script-on-databricks/m-p/14406#M8900</link>
      <description>&lt;P&gt;I have the following basic script, which works fine in PyCharm on my machine.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;from pyspark.sql import SparkSession&lt;/P&gt;&lt;P&gt;print("START")&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;spark = SparkSession \&lt;/P&gt;&lt;P&gt;    .Builder() \&lt;/P&gt;&lt;P&gt;    .appName("myapp") \&lt;/P&gt;&lt;P&gt;    .master('local[*, 4]') \&lt;/P&gt;&lt;P&gt;    .getOrCreate()&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;print(spark)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data = [('James', '', 'Smith', '1991-04-01', 'M', 3000),&lt;/P&gt;&lt;P&gt;        ('Michael', 'Rose', '', '2000-05-19', 'M', 4000),&lt;/P&gt;&lt;P&gt;        ('Robert', '', 'Williams', '1978-09-05', 'M', 4000),&lt;/P&gt;&lt;P&gt;        ('Maria', 'Anne', 'Jones', '1967-12-01', 'F', 4000),&lt;/P&gt;&lt;P&gt;        ('Jen', 'Mary', 'Brown', '1980-02-17', 'F', -1)&lt;/P&gt;&lt;P&gt;        ]&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;columns = ["firstname", "middlename", "lastname", "dob", "gender", "salary"]&lt;/P&gt;&lt;P&gt;df = spark.createDataFrame(data=data, schema=columns)&lt;/P&gt;&lt;P&gt;print(df)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, when I try to run it on a Databricks cluster directly as a Python script, it fails with the error below.&lt;/P&gt;&lt;P&gt;START&lt;/P&gt;&lt;P&gt;Traceback (most recent call last):&lt;/P&gt;&lt;P&gt;File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main&lt;/P&gt;&lt;P&gt;return _run_code(code, main_globals, None,&lt;/P&gt;&lt;P&gt;File "/usr/lib/python3.8/runpy.py", line 87, in _run_code&lt;/P&gt;&lt;P&gt;exec(code, run_globals)&lt;/P&gt;&lt;P&gt;File "/Workspace/Repos/***********/sdk_test/tests/snippets/spark_tests.py", line 13, in&lt;/P&gt;&lt;P&gt;class SparkTests:&lt;/P&gt;&lt;P&gt;File "/Workspace/Repos/*******/sdk_test/tests/snippets/spark_tests.py", line 16, in SparkTests&lt;/P&gt;&lt;P&gt;sc = SparkContext.getOrCreate()&lt;/P&gt;&lt;P&gt;File "/databricks/spark/python/pyspark/context.py", line 400, in getOrCreate&lt;/P&gt;&lt;P&gt;SparkContext(conf=conf or SparkConf())&lt;/P&gt;&lt;P&gt;File "/databricks/spark/python/pyspark/context.py", line 147, in __init__&lt;/P&gt;&lt;P&gt;self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,&lt;/P&gt;&lt;P&gt;File "/databricks/spark/python/pyspark/context.py", line 192, in _do_init&lt;/P&gt;&lt;P&gt;raise RuntimeError("A master URL must be set in your configuration")&lt;/P&gt;&lt;P&gt;RuntimeError: A master URL must be set in your configuration&lt;/P&gt;&lt;P&gt;CalledProcessError: Command 'b'cd ../\n\n/databricks/python3/bin/python -m tests.snippets.spark_tests\n# python -m tests.runner --env=qa --runtime_env=databricks --upload=True --package=sdk\n'' returned non-zero exit status 1.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What am I missing?&lt;/P&gt;</description>
      <pubDate>Thu, 07 Jul 2022 11:15:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-error-when-running-python-script-on-databricks/m-p/14406#M8900</guid>
      <dc:creator>170017</dc:creator>
      <dc:date>2022-07-07T11:15:40Z</dc:date>
    </item>
    <item>
      <title>Re: Spark Error when running python script on databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-error-when-running-python-script-on-databricks/m-p/14408#M8902</link>
      <description>&lt;P&gt;Hi @Patricia Mayer,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 01 Sep 2022 08:19:35 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-error-when-running-python-script-on-databricks/m-p/14408#M8902</guid>
      <dc:creator>Vidula</dc:creator>
      <dc:date>2022-09-01T08:19:35Z</dc:date>
    </item>
  </channel>
</rss>

