How can I use `display()` in a Python notebook with `pyspark.sql.Row` objects, e.g. after calling `first()` on a DataFrame?

Anonymous
04-22-2015 09:24 AM
I'm trying to `display()` the results from calling `first()` on a DataFrame, but `display()` doesn't work with `pyspark.sql.Row` objects. How can I display this result?
Labels: Data-frames, Display, SQL
2 REPLIES

Anonymous
04-22-2015 09:28 AM
The `display()` function requires a collection rather than a single item, so any of the following will display the result:
- `display([df.first()])` # wrap the single Row in a list
- `display(df.take(1))` # `take(1)` is functionally equivalent to `first()`, but returns a list containing one Row
- `display(df.limit(1))` # returns a DataFrame; I'm not 100% sure the ordering is guaranteed, but it's another option to try
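To avoid remembering which call returns a single Row and which returns a list, the wrapping idea above could be folded into a small helper. This is only a sketch: `as_collection` is a hypothetical name, not part of PySpark or the notebook API, and `FakeRow` is a stand-in for `pyspark.sql.Row` so the snippet runs without a Spark session.

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row (an assumption so this runs without Spark).
# A real Row is also a tuple subclass, so the checks below treat it the same way.
FakeRow = namedtuple("FakeRow", ["id", "name"])


def as_collection(result):
    """Return `result` as a list so it can be handed to display()."""
    if isinstance(result, list):
        # df.take(n) and df.collect() already return a list of rows
        return result
    if result is None:
        # df.first() is expected to return None for an empty DataFrame
        return []
    # df.first() returns a single row: wrap it in a list
    return [result]


single = FakeRow(id=1, name="a")        # what df.first() would hand back
wrapped = as_collection(single)         # single row becomes a one-element list
already_list = as_collection([single])  # a list passes through unchanged
print(wrapped)
```

In a notebook this would be used as `display(as_collection(df.first()))` or `display(as_collection(df.take(1)))`.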
11-18-2015 03:22 PM
Use `take()`.

