This also has a lot of overhead: it creates a Spark DataFrame, distributing the data just to pull it back for display. I really don't understand why Databricks doesn't simply allow plotting pandas DataFrames locally by calling display().