Anonymous
10-31-2022 01:52 PM
If you have 5 GB of data, you don't need Spark. Just use your laptop. Spark is built for scale and won't perform well on small datasets because of all the overhead that distributed processing requires.
Also, don't name a pandas DataFrame df_spark_; the name suggests a Spark DataFrame. Just name it something_pdf.
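As a rough sketch of what that could look like, here is a plain pandas version with the _pdf naming. It assumes pandas is installed; the sample data, the column names (region, amount), and the variable names sales_pdf and totals_pdf are all made up for illustration and are not from the original question.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a file that fits in memory.
csv_text = "region,amount\nnorth,10\nsouth,5\nnorth,7\n"

# The _pdf suffix signals that this is a pandas DataFrame, not a Spark one.
sales_pdf = pd.read_csv(io.StringIO(csv_text))

# Ordinary pandas aggregation on a single machine; no cluster is involved.
totals_pdf = sales_pdf.groupby("region", as_index=False)["amount"].sum()

print(totals_pdf)
```

With a real file you would pass its path to pd.read_csv instead of the in-memory string.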