Data Engineering

How can I start SparkSession out of Notebook?

NCat
New Contributor III

Hi community,

How can I start SparkSession out of Notebook?
I want to split my Notebook into small Python modules, and I want some of them to call Spark functionality.

5 REPLIES

-werners-
Esteemed Contributor III

Can you elaborate a bit more?
Are you going to call those modules from a notebook and use Spark functions in them?
Or do you want to explicitly start a separate SparkSession for each module?

sakhulaz
New Contributor II

Hello,

To start a SparkSession outside of a notebook, you can follow these steps to split your code into small Python modules and utilize Spark functionality:

  1. Import Required Libraries: In your Python module, import the necessary libraries for Spark:

from pyspark.sql import SparkSession
  2. Create SparkSession:

Initialize the SparkSession at the beginning of your module:

spark = SparkSession.builder \
    .appName("YourAppName") \
    .config("spark.some.config.option", "config-value") \
    .getOrCreate()

Customize the configuration options as needed.
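A minimal sketch of what such a module could look like (the module name, function names, and table name below are made up for illustration, not from this thread). Note that on a Databricks cluster getOrCreate() simply returns the session the cluster already provides, so the builder settings mainly matter when the code runs outside Databricks:

# spark_helpers.py (hypothetical module name)
from pyspark.sql import SparkSession, DataFrame

def get_spark() -> SparkSession:
    # Return the active SparkSession, creating one only if none exists.
    # On a Databricks cluster this just hands back the built-in session.
    return (
        SparkSession.builder
        .appName("YourAppName")
        .getOrCreate()
    )

def load_table(table_name: str) -> DataFrame:
    # Example helper that uses the session internally.
    spark = get_spark()
    return spark.table(table_name)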

Tharun-Kumar
Honored Contributor II

@NCat 

Databricks provides a SparkSession out of the box. You just have to use the variable "spark". 

[Screenshot attached: Screenshot 2023-08-09 at 5.52.07 PM.png]

To use it in other modules, you have to pass the spark variable to them as a parameter.
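A rough sketch of that approach (the module name, function name, and table name are hypothetical placeholders, not from this thread):

# transforms.py (hypothetical module)
from pyspark.sql import SparkSession

def row_count(spark: SparkSession, table_name: str) -> int:
    # Use the SparkSession handed in by the caller (the notebook).
    return spark.table(table_name).count()

# In the notebook, `spark` already exists, so just pass it along:
# from transforms import row_count
# n = row_count(spark, "my_catalog.my_schema.my_table")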

NCat
New Contributor III

Thank you for all replies.
@-werners- I want to use the SparkSession in modules which are called from the Notebook.

@sakhulaz How can I find the config options needed to connect to the Databricks data?

@Tharun-Kumar Thank you. That approach definitely works for my situation!

-werners-
Esteemed Contributor III

In general (as already stated) a notebook automatically gets a SparkSession.
You don't have to do anything.
If you specifically need separate sessions (isolation), you should run different notebooks (or plan different jobs), as these each get a new session (one session per notebook/job).
Magic commands like %scala, %run, etc. use the same SparkSession, so there is no isolation there.
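If passing the spark variable around explicitly feels clumsy, one possible alternative (assuming PySpark 3.0+, which exposes SparkSession.getActiveSession()) is to let the module pick up the notebook's existing session itself; the module and function names below are just examples:

# session_utils.py (hypothetical module)
from pyspark.sql import SparkSession

def active_spark() -> SparkSession:
    # getActiveSession() returns the session the calling notebook/job
    # is already using, or None if no session is active.
    spark = SparkSession.getActiveSession()
    if spark is None:
        raise RuntimeError("No active SparkSession; call this from a notebook or job")
    return spark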
