01-07-2022 05:21 AM
grid_slice %>%
  sdf_copy_to(
    sc = sc,
    name = "grid_slice",
    overwrite = TRUE
  ) %>%
  sdf_repartition(
    partitions = min(n_executors * 3, NROW(grid_slice)),
    partition_by = "variable"
  ) %>%
  spark_apply(
    f = slice_data_wrapper,
    columns = c(
      variable = "character",
      max_slice = "integer",
      n_slices = "integer"
    ),
    context = list(
      metadata = metadata,
      s3_params = s3_params,
      subfolder = "target_data/orig"
    )
  ) %>%
  compute() %>%
  collect()
I get the error below when I run the code above. grid_slice is a tibble.
I tried several versions of arrow (1.x, 4.x, 6.x), but none of them worked.
ERROR sparklyr: RScript (1243) terminated unexpectedly: Cannot call r___RBuffer__initialize(). See https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow C++ libraries.
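The error message points at an arrow R package that cannot reach its C++ libraries. A minimal sketch for checking this from an R session follows, assuming arrow may or may not be installed there. report_arrow is a hypothetical helper name, not part of sparklyr or arrow. Since spark_apply runs R on the worker nodes, the same check would presumably need to pass on each worker, not only on the driver.

```r
# Hypothetical helper: report whether the arrow R package is installed
# and print its version and build information.
report_arrow <- function() {
  if (!requireNamespace("arrow", quietly = TRUE)) {
    message("arrow is not installed in this R library")
    return(invisible(FALSE))
  }
  message(
    "arrow R package version: ",
    as.character(utils::packageVersion("arrow"))
  )
  # arrow_info() summarises the C++ library version and capabilities;
  # if the C++ library is missing, the output should indicate that.
  print(arrow::arrow_info())
  invisible(TRUE)
}

report_arrow()
```

If the output shows no C++ library, reinstalling arrow as described at the URL in the error message is the likely next step.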
01-11-2022 12:48 AM
Hi @Kaniz Fatma ,
The issue was fixed with R version 4.1.2, Spark version 2.4.5, and arrow version 5.0.0.
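A rough sketch for comparing a local setup against that combination is shown here. The version numbers are only the ones reported in this thread, not an official compatibility matrix, and the commented-out lines assume the remotes package and a live sparklyr connection named sc.

```r
# Versions reported as working in this thread (an assumption, not a guarantee).
target_r <- "4.1.2"
target_arrow <- "5.0.0"

# Compare the running R version with the reported one.
r_ok <- getRversion() >= target_r
message("R ", as.character(getRversion()),
        if (r_ok) " is at least " else " is older than ", target_r)

# Compare the installed arrow version, if any, with the reported one.
arrow_installed <- requireNamespace("arrow", quietly = TRUE)
if (arrow_installed) {
  arrow_ok <- utils::packageVersion("arrow") == target_arrow
  message("arrow matches ", target_arrow, ": ", arrow_ok)
}

# To pin arrow to the reported version (requires the remotes package):
# remotes::install_version("arrow", version = "5.0.0")

# To check the Spark version (requires a live connection `sc`):
# sparklyr::spark_version(sc)
```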
01-10-2022 12:10 AM
Hi @Kaniz Fatma ,
I am facing a similar issue. Could you please share the solution as soon as possible?
Thanks
01-10-2022 06:12 AM
Hi @Kaniz Fatma
Did you find a solution? Please let us know.
01-10-2022 06:31 AM
Hi @Kaniz Fatma ,
I am currently using R 3.6.2. I will upgrade to 4.x and let you know. Thanks for the input.