Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Unable to use library GraphFrames

sachamourier
Contributor

Hello,

I am trying to install and use the library GraphFrames but keep receiving the following error: "AttributeError: 'SparkSession' object has no attribute '_sc'".

I have tried to install the library on my all-purpose cluster (Access mode: Standard). The installation succeeds, but the code fails. I am using the library version "graphframes:graphframes:0.8.4-spark3.5-s_2.13" and my Spark version is 3.5.2.
I have also tried to install the library via pip install, but with no success either.

Does anyone know how to make it work? I would like to avoid having to change my cluster's access mode.

Thanks a lot,

Sacha


5 REPLIES

szymon_dybczak
Esteemed Contributor III

Hi @sachamourier ,

Maybe try Databricks Runtime ML, which already includes an optimized installation of GraphFrames?

How to use GraphFrames on Azure Databricks - Azure Databricks | Microsoft Learn

Hi @sachamourier ,

But if you don't want to use a different runtime, then you need to change the access mode. In standard access mode you don't have access to SparkContext, which this library requires. Hence you're getting an error like "'SparkSession' object has no attribute '_sc'" (where _sc refers to SparkContext).

[Screenshot: szymon_dybczak_1-1753719122338.png]
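To make the failure easier to spot before building a graph, here is a minimal sketch of a pre-flight check. It assumes that the missing `_sc` attribute from the error message is a reliable signal that SparkContext is unavailable; the helper names `can_reach_spark_context` and `require_spark_context` are hypothetical and not part of GraphFrames or PySpark.

```python
def can_reach_spark_context(session) -> bool:
    """Return True if `session` exposes a non-None private `_sc` attribute.

    Assumption: the `_sc` attribute named in the error message is what is
    missing in standard access mode, so its absence is used as the signal.
    """
    return getattr(session, "_sc", None) is not None


def require_spark_context(session) -> None:
    """Raise a readable error instead of the raw AttributeError."""
    if not can_reach_spark_context(session):
        raise RuntimeError(
            "This session does not expose a SparkContext (`_sc`). "
            "The GraphFrames setup discussed in this thread needs one; "
            "consider dedicated access mode or Databricks Runtime ML."
        )
```

In a notebook this would be called as `require_spark_context(spark)` right before `GraphFrame(nodes_df, edges_df)`.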

 

And yes, I can confirm that it works in dedicated access mode. I used the following code:

 

pip install graphframes-py

from functools import reduce
from pyspark.sql import functions as F
from graphframes import GraphFrame

nodes = [
    (1, "Alice", 30),
    (2, "Bob", 25),
    (3, "Charlie", 35)
]
nodes_df = spark.createDataFrame(nodes, ["id", "name", "age"])

edges = [
    (1, 2, "friend"),
    (2, 1, "friend"),
    (2, 3, "friend"),
    (3, 2, "enemy")  # eek!
]
edges_df = spark.createDataFrame(edges, ["src", "dst", "relationship"])

g = GraphFrame(nodes_df, edges_df)

 

And as you can see, it works as expected:

[Screenshot: szymon_dybczak_0-1753719964171.png]


One thing to remember: the Python distribution does not include the JVM core. So I also had to install this version of the library on my cluster: graphframes:graphframes:0.8.3-spark3.5-s_2.13

[Screenshot: szymon_dybczak_1-1753720000895.png]

[Screenshot: szymon_dybczak_2-1753720046159.png]
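Since the JVM library is installed separately from the Python package, it can help to check that its coordinate lines up with the cluster. Below is a rough sketch that splits the coordinate into its parts. It assumes the `sparkX.Y` suffix names the Spark build line and the `s_X.Y` suffix names the Scala build, which is my reading of the naming pattern and not something confirmed in this thread; `parse_coordinate` and `matches_spark` are hypothetical helpers.

```python
from typing import NamedTuple


class GraphFramesCoordinate(NamedTuple):
    group: str
    artifact: str
    library_version: str
    spark_build: str   # assumed meaning of the "sparkX.Y" suffix
    scala_build: str   # assumed meaning of the "s_X.Y" suffix


def parse_coordinate(coordinate: str) -> GraphFramesCoordinate:
    """Split 'group:artifact:version-sparkX.Y-s_X.Y' into named fields."""
    group, artifact, version = coordinate.split(":")
    library_version, spark_part, scala_part = version.split("-")
    if not spark_part.startswith("spark") or not scala_part.startswith("s_"):
        raise ValueError(f"Unexpected version suffixes in {coordinate!r}")
    return GraphFramesCoordinate(
        group,
        artifact,
        library_version,
        spark_part[len("spark"):],
        scala_part[len("s_"):],
    )


def matches_spark(coord: GraphFramesCoordinate, spark_version: str) -> bool:
    """Compare only major.minor of the cluster's Spark version."""
    return spark_version.split(".")[:2] == coord.spark_build.split(".")


coord = parse_coordinate("graphframes:graphframes:0.8.3-spark3.5-s_2.13")
print(coord)
print(matches_spark(coord, "3.5.2"))
```

The Scala build is only extracted here, not validated; it would still need to be compared by hand against the Scala version your runtime reports.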

 

 

sachamourier
Contributor

@szymon_dybczak Thanks for the responses. I did change my all-purpose cluster's access mode and it worked. I figured that was a nicer option than changing the runtime.

szymon_dybczak
Esteemed Contributor III

Cool, great that it worked for you!
