Hi,
I am trying to read a CSV file into a Spark DataFrame using sparklyr::spark_read_csv(), but I am getting a 403 Access Denied error.
I have stored my AWS credentials as environment variables, and I can successfully read the same file into an R data frame with arrow::read_csv_arrow(). However, spark_read_csv() fails.
I have confirmed that I am connected to Spark and can read Parquet files stored elsewhere. Any advice? Thanks!
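For reference, the credentials are set in the shell before starting R, along these lines (the values shown here are placeholders, not my real keys):

```shell
# AWS credentials as environment variables (placeholder values).
# arrow::read_csv_arrow picks these up automatically.
export AWS_ACCESS_KEY_ID="<my-access-key>"
export AWS_SECRET_ACCESS_KEY="<my-secret-key>"
```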
my_file <- glue::glue("s3://my-bucket/my-folder/my-file-name.csv")
## This works
mydata <- arrow::read_csv_arrow(
  file = my_file
)
## This doesn't
mydata <- sparklyr::spark_read_csv(
  sc,
  name = "mydata",
  path = my_file
)
# Error message
Error : java.nio.file.AccessDeniedException
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden; request