07-28-2025 08:17 AM
Hello,
I am trying to install and use the library GraphFrames but keep receiving the following error: "AttributeError: 'SparkSession' object has no attribute '_sc'".
I have tried installing the library on my all-purpose cluster (Access mode: Standard). The installation succeeds, but the code does not run. I am using library version "graphframes:graphframes:0.8.4-spark3.5-s_2.13" and my Spark version is 3.5.2.
I have also tried installing the library via pip install, but with no success either.
Does anyone know how to make it work? I would like to avoid changing my cluster's access mode.
Thanks a lot,
Sacha
07-28-2025 08:40 AM
Hi @sachamourier ,
Maybe try to use Databricks Runtime ML which already includes an optimized installation of GraphFrames?
How to use GraphFrames on Azure Databricks - Azure Databricks | Microsoft Learn
07-28-2025 09:19 AM
Hi @sachamourier ,
But if you don't want to use a different runtime, then you need to change the access mode. In standard access mode you don't have access to the SparkContext, which this library requires. Hence you're getting an error like "'SparkSession' object has no attribute '_sc'" (where _sc refers to the SparkContext).
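To see why this surfaces as an AttributeError rather than a clearer message, here is a minimal, self-contained sketch. `RestrictedSparkSession` is a hypothetical stub standing in for a session that does not expose the internal SparkContext handle; it is not the actual Databricks implementation:

```python
class RestrictedSparkSession:
    """Stub mimicking a SparkSession in standard access mode:
    no internal SparkContext handle (_sc) is exposed."""
    pass

session = RestrictedSparkSession()

# GraphFrames reaches for the session's private _sc attribute under
# the hood; on a session without it, the lookup fails immediately.
try:
    session._sc
except AttributeError as err:
    print(err)  # 'RestrictedSparkSession' object has no attribute '_sc'
```

The same lookup succeeds on a dedicated-mode cluster, where the session carries a real SparkContext, which is why changing the access mode fixes it.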
07-28-2025 09:28 AM
And yes, I can confirm that it works in dedicated access mode. I used the following code:
%pip install graphframes-py
from functools import reduce
from pyspark.sql import functions as F
from graphframes import GraphFrame
nodes = [
(1, "Alice", 30),
(2, "Bob", 25),
(3, "Charlie", 35)
]
nodes_df = spark.createDataFrame(nodes, ["id", "name", "age"])
edges = [
(1, 2, "friend"),
(2, 1, "friend"),
(2, 3, "friend"),
(3, 2, "enemy") # eek!
]
edges_df = spark.createDataFrame(edges, ["src", "dst", "relationship"])
g = GraphFrame(nodes_df, edges_df)
And as you can see, it works as expected.
One thing to remember: the Python package does not include the JVM core, so I also had to install this version of the library (Maven coordinate) on my cluster: graphframes:graphframes:0.8.3-spark3.5-s_2.13
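Outside Databricks (plain PySpark), the JVM half can instead be attached at session startup via the standard `spark.jars.packages` config. A sketch, assuming the graphframes Python package is already pip-installed; the app name is arbitrary:

```python
from pyspark.sql import SparkSession

# Sketch: resolve the GraphFrames JVM jars at session startup.
# On Databricks you would attach the Maven coordinate as a cluster
# library instead of setting this config.
spark = (
    SparkSession.builder
    .appName("graphframes-demo")
    .config("spark.jars.packages",
            "graphframes:graphframes:0.8.3-spark3.5-s_2.13")
    .getOrCreate()
)
```

Either way, both halves (the Python wrapper and the JVM jars) must be present, which is why installing only via pip fails.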
07-29-2025 12:28 AM
@szymon_dybczak Thanks for the responses. I changed my all-purpose cluster's access mode and it worked. I figured that was a nicer option than changing the runtime.
07-29-2025 12:34 AM
Cool, great that it worked for you!