Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Does anyone have a single example of a graphframe with two+ types of vertices? (e.g. user and post, not user to user)

jonathan-dufaul
Valued Contributor

I have gone through about 75 pages, and every single example only has relationships from one type of object to the same type of object. About 90% use the exact same "Alice"/"Bob" "friends" example.

Has anyone ever made a graphframe with two types of objects? Based on my frustrating-as-heck searching, I'm fairly convinced the answer is no.

Edit: this is the example:

from graphframes import GraphFrame

# Vertex DataFrame: every row is a person, keyed by "id".
vertices = sqlContext.createDataFrame([
    ("a", "Alice", 34),
    ("b", "Bob", 36),
    ("c", "Charlie", 30),
    ("d", "David", 29),
    ("e", "Esther", 32),
    ("e1", "Esther2", 32),
    ("f", "*****", 36),
    ("g", "Gabby", 60),
    ("h", "Mark", 61),
    ("i", "Gunter", 62),
    ("j", "Marit", 63)], ["id", "name", "age"])

# Edge DataFrame: "src" and "dst" refer to vertex ids.
edges = sqlContext.createDataFrame([
    ("a", "b", "friend"),
    ("b", "a", "follow"),
    ("c", "a", "follow"),
    ("c", "f", "follow"),
    ("g", "h", "follow"),
    ("h", "i", "friend"),
    ("h", "j", "friend"),
    ("j", "h", "friend"),
    ("e", "e1", "friend")
], ["src", "dst", "relationship"])

g = GraphFrame(vertices, edges)

# PlotGraph is not defined in this snippet.
PlotGraph(g.edges)

I'm convinced nobody knows how to use graphframes beyond this stupid trivial example and this is super frustrating.

Accepted Solutions

-werners-
Esteemed Contributor III

I feel your pain.

I once tried to use GraphFrames to flatten a complex tree and ended up using GraphX (which is even worse to use, but at least it is more flexible).

So maybe take a look at GraphX? Beware, it is terrible to use.

I wonder what happened to making Cypher (or TinkerPop or whatever) available on Spark/Databricks.

That being said:

Maybe you can try putting the type of object into your edges. The vertices stay typeless and contain only values, and each edge carries a different value depending on which kinds of vertices it connects.
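
One possible reading of this suggestion, as a rough PySpark sketch rather than a confirmed recipe: users and posts share a single vertex table with a generic `label` column, and each edge records what kind of vertex sits at either end. The sample ids and labels, the `src_type`/`dst_type` columns, and the helper names `build_graph`, `user_to_post_edges`, and `user_post_pairs` are all hypothetical and not from the thread. `build_graph` assumes an existing SparkSession and an installed `graphframes` package.

```python
# Hypothetical sample data: two kinds of objects (users and posts)
# live in ONE vertex table, distinguished only by what the edges say.
vertex_rows = [
    ("u1", "Alice"),
    ("u2", "Bob"),
    ("p1", "Intro to GraphFrames"),
    ("p2", "Spark tips"),
]
vertex_columns = ["id", "label"]

# Each edge carries the kind of its endpoints (assumed column names).
edge_rows = [
    ("u1", "p1", "wrote", "user", "post"),
    ("u2", "p2", "wrote", "user", "post"),
    ("u2", "p1", "liked", "user", "post"),
    ("u1", "u2", "follows", "user", "user"),
]
edge_columns = ["src", "dst", "relationship", "src_type", "dst_type"]


def build_graph(spark):
    """Build a GraphFrame from the rows above, given a SparkSession."""
    # Imported lazily so the plain-Python data above can be inspected
    # without Spark or graphframes installed.
    from graphframes import GraphFrame

    vertices = spark.createDataFrame(vertex_rows, vertex_columns)
    edges = spark.createDataFrame(edge_rows, edge_columns)
    return GraphFrame(vertices, edges)


def user_to_post_edges(g):
    """Keep only the edges that go from a user to a post."""
    return g.edges.filter("src_type = 'user' AND dst_type = 'post'")


def user_post_pairs(g):
    """Motif query: pair each user label with the post label it points to."""
    return (
        g.find("(a)-[e]->(b)")
        .filter("e.src_type = 'user' AND e.dst_type = 'post'")
        .selectExpr("a.label AS user_label", "e.relationship", "b.label AS post_label")
    )
```

In a Databricks notebook, something like `g = build_graph(spark)` followed by `user_to_post_edges(g).show()` would list only the user-to-post edges, while the user-to-user `follows` edge stays in the same graph.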


2 REPLIES

I think I finally found a page that explicitly said, "no, graphframes cannot handle different types of vertices," which seems to defeat like 90% of the reason you would want a graph database, or to make it relevant only to the simplest graph database problems.

They blamed the limitation on DataFrames being so restrictive, but, like, they chose to build this on DataFrames.

It's like group theory when I need cosets, rings, and ideals. I'd be ashamed of publicly releasing this and parading it as a viable product.
