I can read all of my S3 data without any issues after configuring my cluster with an instance profile. However, when I try to run the following DLT decorator, it gives me an access denied error. Are there some other IAM tweaks I need to make for Delta? Looking at the pipeline, it appears to fail when setting up tables in S3 after the initial read.

Note that I also tried setting my storage location to a path in S3, with both the s3a:// and /mnt syntax, with no luck either. And if I set storage to my bucket, it hangs on waiting for resources before failing with `DataPlaneException: Failed to start the DLT service on cluster`.

Ultimately I want to use this with Auto Loader and cloudFiles (see the sketch at the end of this post), but this is a simplified test that should work anyway -- thanks!
# This gives me a 403 java.nio.file.AccessDeniedException on the S3 location
import dlt
from pyspark.sql.functions import explode, col

@dlt.table
def rtb_dlt_bids_bronze():
    return (
        spark.read.format("json")
            .option("multiLine", "true")
            .option("inferSchema", "true")
            .load("/mnt/demo/<pathtofile>"))
On the other hand, this works fine:
display(spark.read.format("json")
    .option("multiLine", "true")
    .option("inferSchema", "true")
    .load("/mnt/demo/<pathtofile>"))
Here's the full error from the failed DLT run:

raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o772.load.
: java.nio.file.AccessDeniedException: s3a://<pathtofile>: getFileStatus on s3a://<pathtofile>: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden; request: HEAD https://<pathtofile>; {} Hadoop 3.3.1, aws-sdk-java/1.12.189 Linux/5.4.0-1075-aws OpenJDK_64-Bit_Server_VM/25.302-b08 java/1.8.0_302 scala/2.12.14 vendor/Azul_Systems,_Inc. cfg/retry-mode/legacy com.amazonaws.services.s3.model.GetObjectMetadataRequest
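For completeness, here's a rough sketch of the Auto Loader version I'm ultimately aiming for -- the table name is just for illustration, and the path is the same placeholder as above:

import dlt

# Sketch only: incrementally ingest new JSON files from the mounted
# S3 path with Auto Loader (cloudFiles). I haven't been able to test
# this past the same access error described above.
@dlt.table
def rtb_dlt_bids_bronze_autoloader():
    return (
        spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .option("multiLine", "true")
            .load("/mnt/demo/<pathtofile>"))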