From the Clusters tab, select a cluster and view the Spark UI.
The Environment tab shows the current Spark configuration settings.
The full list of Spark configuration parameters is documented here: https://spark.apache.org/docs/latest/configuration.html
The SparkContext is provided for you within the notebook UI, so you cannot change these values from your notebook code. Once a SparkConf is passed to the SparkContext constructor, its values are cloned and can no longer be changed. This is a Spark limitation.
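To illustrate the point, here is a short sketch for a PySpark notebook, assuming `sc` is the SparkContext that Databricks Cloud provides. You can read the effective configuration, but mutating the SparkConf afterward has no effect on the running context:

```python
# Sketch: assumes a PySpark notebook where `sc` is the provided SparkContext.

# Reading the effective configuration works; getConf() returns a copy.
conf = sc.getConf()
print(conf.get("spark.app.name"))

# Setting a value on this SparkConf is silently ignored by the running
# SparkContext -- the constructor already cloned the configuration.
conf.set("spark.executor.memory", "8g")
```

The same behavior applies in Scala notebooks: configuration must be fixed before the context is constructed, which in Databricks Cloud happens before your code runs.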
Note that Databricks has already tuned Spark for the most common workloads running on the specific EC2 instance types used within Databricks Cloud.
In other words, you shouldn't have to change these default values except in extreme cases. To change these defaults, please contact Databricks Cloud support.
If you're working with the SQLContext or HiveContext, you can manually set configuration properties using HiveQL's SET key=value command with the spark.sql.* properties listed in the Spark SQL programming guide: https://spark.apache.org/docs/latest/sql-programming-guide.html#configuration.
However, overriding these configuration values may cause problems for other users of the cluster.
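As a minimal sketch, assuming the notebook provides a `sqlContext` (an SQLContext or HiveContext), a spark.sql.* property can be set and inspected like this:

```python
# Sketch: assumes `sqlContext` is the SQLContext/HiveContext provided
# by the notebook environment.

# Override a Spark SQL property for this context using HiveQL's SET command.
sqlContext.sql("SET spark.sql.shuffle.partitions=10")

# Issuing SET with just the key returns the current value.
sqlContext.sql("SET spark.sql.shuffle.partitions").show()
```

Because such overrides apply to the shared context, it is worth resetting them (or coordinating with other users) when you are done.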