Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

acsmaggart
by New Contributor III
  • 4927 Views
  • 6 replies
  • 3 kudos

`collect()`ing Large Datasets in R

Background: I'm working on a pilot project to assess the pros and cons of using Databricks to train models in R. I am using a dataset that occupies about 5.7 GB of memory when loaded into a pandas DataFrame. The data are stored in a Delta table in ...

Latest Reply
Annapurna_Hiriy
Databricks Employee
  • 3 kudos

@acsmaggart Please try using collect_larger() to collect the larger dataset. This should work. Please refer to the following document for more info on the library: https://medium.com/@NotZacDavies/collecting-large-results-with-sparklyr-8256a0370ec6

5 More Replies
mbaumga
by New Contributor III
  • 3414 Views
  • 7 replies
  • 9 kudos

How to request the addition of pre-installed R packages on the clusters?

Today, many R packages are pre-installed on the standard Databricks clusters. Libraries like "tidyverse", "ggplot2", etc. are there, as is the great library "readxl" for loading Excel files. But unfortunately, its counterpart "writexl" is not pre-instal...

Latest Reply
wicckkjoe
New Contributor II
  • 9 kudos

I just need to figure out who decides which R packages are pre-installed on the cluster.

6 More Replies
fsimoes
by New Contributor II
  • 2423 Views
  • 2 replies
  • 1 kudos

Resolved! Docker image with libraries + MLFlow Experiments

Hi everybody, I have a scenario where multiple teams work with Python and R, and these teams use a lot of different libraries. Because of these dozens of libraries, cluster startup took a long time. So I created a Docker image, where I can ...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Fabio Simoes, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

1 More Replies
yopbibo
by Contributor II
  • 1234 Views
  • 1 reply
  • 0 kudos

Sending R functions to worker nodes

Hi! If I need to use many workers to distribute regular pandas work, I would use a pandas_UDF (regular Python crunches a slice of my data on each node, and all results are combined back on the driver node). Is there something equivalent for R? Thanks,
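
For readers unfamiliar with the pattern the question describes, here is a minimal sketch of split-apply-combine in plain Python. It is not Spark code and not the poster's setup: it uses only the standard library, a thread pool stands in for worker nodes, and the names `split_by_key`, `summarize`, and `apply_to_groups` are hypothetical. On Spark, the R-side counterparts usually pointed to for this kind of grouped apply are SparkR's `gapply`/`dapply` and sparklyr's `spark_apply`.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_key(rows, key):
    """Split: group rows (plain dicts) by the value stored under `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def summarize(group_rows):
    """Apply: hypothetical per-slice function, the 'regular Python' step."""
    values = [r["value"] for r in group_rows]
    return {
        "group": group_rows[0]["group"],
        "n": len(values),
        "mean": sum(values) / len(values),
    }


def apply_to_groups(rows, key, func, max_workers=4):
    """Combine: run `func` on every slice in parallel and gather the results.

    The thread pool is only an assumption standing in for cluster workers;
    results come back in the order the groups were first seen.
    """
    groups = split_by_key(rows, key)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, groups.values()))


rows = [
    {"group": "a", "value": 1.0},
    {"group": "b", "value": 10.0},
    {"group": "a", "value": 3.0},
]
result = apply_to_groups(rows, "group", summarize)
```

Here `result` holds one summary dict per group. A grouped pandas UDF follows the same shape, with Spark doing the splitting and shipping each slice to an executor.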

User16830818524
by New Contributor II
  • 7131 Views
  • 2 replies
  • 1 kudos

Resolved! Host R Shiny App

Can I host an R Shiny app on a Databricks cluster?

Latest Reply
mgiglia
Contributor
  • 1 kudos

I’ll be asking my rep about the hosted RShiny server in private preview. Our team didn’t know about that, so we’ve struggled through putting our Shiny app (developed on Databricks using RStudio; that part was fantastic) into a container and hosting it...

1 More Replies
User16753724663
by Valued Contributor
  • 3488 Views
  • 1 reply
  • 1 kudos

Unable to install sf and rgeos R packages on the cluster

Got the following error: java.lang.RuntimeException: Installation failed with message: Error installing R package: Could not install package with error: installation of package ‘rgdal’ had non-zero exit status. Full error log available at /databricks/drive...

Latest Reply
User16753724663
Valued Contributor
  • 1 kudos

We can use the init script below to install the packages on the cluster:
%scala
dbutils.fs.put("dbfs:/databricks/init_scripts/rlib.sh", """
#!/bin/bash
sudo apt-get install -y libudunits2-dev
sudo add-apt-repository ppa:ubuntugis/ubuntugis-uns...
