Spark error: RScript (1243) terminated unexpectedly: Cannot call r___RBuffer__initialize().

chandan_a_v
Valued Contributor

grid_slice %>%
  sdf_copy_to(
    sc = sc,
    name = "grid_slice",
    overwrite = TRUE
  ) %>%
  sdf_repartition(
    partitions = min(n_executors * 3, NROW(grid_slice)),
    partition_by = "variable"
  ) %>%
  spark_apply(
    f = slice_data_wrapper,
    columns = c(
      variable = "character",
      max_slice = "integer",
      n_slices = "integer"
    ),
    context = list(
      metadata = metadata,
      s3_params = s3_params,
      subfolder = "target_data/orig"
    )
  ) %>%
  compute() %>%
  collect()

I got the error below when I ran the code above. grid_slice is a tibble.

I tried different versions of arrow (1.x, 4.x, 6.x), but none of them worked.

ERROR sparklyr: RScript (1243) terminated unexpectedly: Cannot call r___RBuffer__initialize(). See https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow C++ libraries.

rajatha_d
New Contributor II

Hi @Kaniz Fatma,

I am facing a similar issue. Could you please share a solution as soon as possible?

Thanks

chandan_a_v
Valued Contributor

Hi @Kaniz Fatma,

Did you find a solution? Please let us know.

Hi @Kaniz Fatma,

I am currently using R 3.6.2. I will upgrade to 4.x and let you know. Thanks for the input.

Hi @Kaniz Fatma,

With R version 4.1.2, Spark version 2.4.5, and arrow version 5.0.0, the issue is fixed.
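
For anyone hitting the same error, a quick way to see which R and arrow versions a session actually picks up is sketched below. This is only a minimal check, not an official diagnostic: the helper name check_arrow_env and the "4.0.0" threshold are my own assumptions, taken from the R 3.6.2 to 4.x upgrade mentioned above. Since spark_apply runs R on the workers, the versions there may differ from the driver, so the same check may need to be run on the workers as well.

```r
# Hypothetical helper (not part of sparklyr or arrow): reports the R and
# arrow versions visible to the current R session.
check_arrow_env <- function(min_r = "4.0.0") {
  r_ver <- getRversion()

  # requireNamespace() returns FALSE instead of failing if arrow is missing.
  has_arrow <- requireNamespace("arrow", quietly = TRUE)
  arrow_ver <- if (has_arrow) {
    as.character(utils::packageVersion("arrow"))
  } else {
    NA_character_
  }

  list(
    r_version       = as.character(r_ver),
    r_ok            = r_ver >= min_r,   # assumed threshold, see note above
    arrow_installed = has_arrow,
    arrow_version   = arrow_ver
  )
}

env <- check_arrow_env()
str(env)
```

If r_ok is FALSE or arrow_version is NA, that would be the first thing to fix before retrying the spark_apply call.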
