grid_slice %>%
  sdf_copy_to(
    sc = sc,
    name = "grid_slice",
    overwrite = TRUE
  ) %>%
  sdf_repartition(
    partitions = min(n_executors * 3, NROW(grid_slice)),
    partition_by = "variable"
  ) %>%
  spark_apply(
    f = slice_data_wrapper,
    columns = c(
      variable = "character",
      max_slice = "integer",
      n_slices = "integer"
    ),
    context = list(
      metadata = metadata,
      s3_params = s3_params,
      subfolder = "target_data/orig"
    )
  ) %>%
  compute() %>%
  collect()
I get the error below when I run the code above. grid_slice is a tibble.
I have tried several versions of arrow (1.x, 4.x, 6.x), but none of them worked.
ERROR sparklyr: RScript (1243) terminated unexpectedly: Cannot call r___RBuffer__initialize(). See https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow C++ libraries.