I have a notebook running on DBR 12.2 with the following R code:
install.packages("microbenchmark")
install.packages("furrr")
library(microbenchmark)
library(tidyverse)
# example tibble
df_test <- tibble(id = 1:100000, street_raw = rep("Bahnhofstrasse 12", 100000))
# function
test_fc <- function(str = "") {
  value <- str_to_lower(str)
  value <- str_replace_all(value, pattern = "n", replacement = "N")
  value <- str_replace_all(value, pattern = "h", replacement = "H")
  return(value)
}
# single core with purrr package
microbenchmark(df_test %>% mutate(street_all = map_chr(street_raw, test_fc)), times = 10)
# DBR 12.2 / median 9.300949 seconds
# DBR 13.2 / median 16.04199 seconds
# multi core with furrr package
library(furrr)
plan(multisession)
microbenchmark(df_test %>% mutate(street_all = future_map_chr(street_raw, test_fc)), times = 10)
# DBR 12.2 / median 1.861389 seconds
# DBR 13.2 / median 2.781327 seconds
With a cluster (Standard_F8s, 16 GB RAM, 8 cores) on DBR 12.2, I got these results:
single core 9.30 s / multi core 1.86 s
With the same cluster on DBR 13.2, I got:
single core 16.04 s / multi core 2.78 s
Can anyone give me some advice on how to speed this up on DBR 13+ as well, or is it slower in general?
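For comparison, here is a minimal sketch of a vectorized variant that may be worth benchmarking on both runtimes. It relies on str_to_lower and str_replace_all being vectorized over their input, so test_fc can take the whole column in one call instead of once per row through map_chr. The sample size of 1000 and the names res_map and res_vec are only for illustration and are not part of the notebook above. The last lines print the R and package versions, on the assumption (not verified here) that these differ between DBR 12.2 and 13.2.

```r
library(stringr)
library(purrr)

# Same transformation as test_fc above, repeated so this block runs on its own
test_fc <- function(str = "") {
  value <- str_to_lower(str)
  value <- str_replace_all(value, pattern = "n", replacement = "N")
  value <- str_replace_all(value, pattern = "h", replacement = "H")
  value
}

# Smaller sample than in the benchmark above (illustrative size)
street_raw <- rep("Bahnhofstrasse 12", 1000)

# Element-wise: one R-level function call per row, as in the purrr benchmark
res_map <- map_chr(street_raw, test_fc)

# Vectorized: a single call on the whole character vector
res_vec <- test_fc(street_raw)

# Both approaches should give the same result
print(identical(res_map, res_vec))

# Versions to compare between the two runtimes
print(R.version.string)
print(packageVersion("purrr"))
print(packageVersion("stringr"))
```

In the pipeline above this would correspond to df_test %>% mutate(street_all = test_fc(street_raw)). Whether it narrows the gap between the two runtimes is something I have not measured.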