Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How can I view and change the SparkConf settings if the SparkContext (sc) is already provided for me?

cfregly
Contributor
5 REPLIES

cfregly
Contributor

From the Clusters tab, select a cluster and view the Spark UI.

The Environment tab shows the current Spark configuration settings.

Here is an exhaustive list of the Spark Config params: https://spark.apache.org/docs/latest/configuration.html
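
You can also read the same values directly from a notebook; a minimal sketch, assuming sc is the preconfigured SparkContext that Databricks provides:

# Print every key/value pair in the active Spark configuration
for key, value in sc.getConf().getAll():
    print(key, value)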

The SparkContext is provided for you within the notebook UI, therefore you cannot change these values within your notebook code. Once the SparkConf is passed to the SparkContext constructor, the values are cloned and cannot be changed. This is a Spark limitation.
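
To see that limitation in action, a quick sketch (the memory value below is arbitrary):

conf = sc.getConf()                      # returns a copy of the cloned configuration
print(conf.get("spark.app.name"))        # reading works fine
conf.set("spark.executor.memory", "8g")  # silently has no effect on the running context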

One thing to note is that Databricks has already tuned Spark for the most common workloads running on the specific EC2 instance types used within Databricks Cloud.

In other words, you shouldn't have to change these default values except in extreme cases. To change these defaults, please contact Databricks Cloud support.

If you're working with the SQLContext or HiveContext, you can manually set configuration properties using HiveQL's SET key=value command with the spark.sql.* properties from this list: https://spark.apache.org/docs/latest/sql-programming-guide.html#configuration
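
For example, a sketch of tuning the shuffle parallelism this way, assuming sqlContext is the HiveContext provided in the notebook (the value 10 is arbitrary):

# Set a spark.sql.* property at runtime via HiveQL's SET command
sqlContext.sql("SET spark.sql.shuffle.partitions=10")
# Confirm the property took effect
sqlContext.sql("SET spark.sql.shuffle.partitions").show()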

However, overriding these configuration values may cause problems for other users of the cluster.

JonathanSpooner
New Contributor II

Hi, may I know how you handled the config for Elasticsearch? I also have to stream data to Elasticsearch.

There is a 'Spark' tab on the cluster creation page; you can add the configs there before starting the cluster.
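
For example, Elasticsearch connection settings can be entered there one key-value pair per line; a sketch with a hypothetical hostname (the elasticsearch-hadoop connector picks up spark.es.* properties from the Spark config):

spark.es.nodes elasticsearch-host.example.com
spark.es.port 9200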

MatthewValenti
New Contributor II

This is an old post; however, is this still accurate for the latest version of Databricks in 2019? If so, how would you approach the following?

1. Connect to many MongoDBs.

2. Connect to MongoDB when the connection string information is dynamic (i.e., stored in a Spark table).
