Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

acsmaggart
by New Contributor III
  • 2917 Views
  • 6 replies
  • 2 kudos

`collect()`ing Large Datasets in R

Background: I'm working on a pilot project to assess the pros and cons of using Databricks to train models in R. I am using a dataset that occupies about 5.7 GB of memory when loaded into a pandas DataFrame. The data are stored in a Delta table in ...

collecting the data using pyspark
collecting the data using R
Latest Reply
Annapurna_Hiriy
New Contributor III
  • 2 kudos

@acsmaggart Please try using `collect_larger()` to collect the larger dataset. This should work. For more information on the library, please refer to the following article: https://medium.com/@NotZacDavies/collecting-large-results-with-sparklyr-8256a0370ec6
