04-11-2022 12:54 PM
I have data in a Spark DataFrame and I write it to an S3 location. It has some complex data types, such as structs. When I create a table on top of the S3 location using
CREATE TABLE IF NOT EXISTS table_name
USING DELTA
LOCATION 's3://.../...';
the table has all null values in it, and I am not sure what is going wrong.
04-11-2022 01:05 PM
@John Constantine,
- Try loading it as a DataFrame (spark.read.format("delta").load(path)) and check what actually gets loaded.
- It may be easier to mount the S3 location as a folder so you can confirm that all the data is there (use dbutils or %fs to check) and that the connection works correctly.
- Also try REFRESH [TABLE] table_name.
- Share more code; it is not clear what exactly is being loaded. For example, the Delta folder itself should be loaded, not a particular file inside it.
- The Delta folder holds the table's parts/versions written as Parquet files. You can load those files separately to check whether the data in them is OK.
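One way to check the folder contents without Spark is to read the Delta transaction log directly. The sketch below is only an illustration: the function name summarize_delta_log is made up for this example, and it assumes the table folder is reachable as an ordinary file path (for instance through a mount). It replays only the JSON commit files under _delta_log and ignores Parquet checkpoints, so on a table whose old commits have been cleaned up the result may be incomplete.

```python
import json
from pathlib import Path


def summarize_delta_log(table_dir):
    """Replay the JSON commits in <table_dir>/_delta_log (hypothetical helper).

    Returns the column names from the latest metaData action and the data
    files that are still active after all add/remove actions.
    Assumption: checkpoints are ignored, only *.json commits are read.
    """
    log_dir = Path(table_dir) / "_delta_log"
    if not log_dir.is_dir():
        raise FileNotFoundError(
            f"no _delta_log under {table_dir}; is this the table root folder?"
        )

    columns = []
    active_files = set()
    # Commit files are named by zero-padded version, so sorting replays them in order.
    for commit in sorted(log_dir.glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)  # one action per line
            if "metaData" in action:
                # schemaString is itself a JSON document describing a struct.
                schema = json.loads(action["metaData"]["schemaString"])
                columns = [field["name"] for field in schema["fields"]]
            elif "add" in action:
                active_files.add(action["add"]["path"])
            elif "remove" in action:
                active_files.discard(action["remove"]["path"])
    return {"columns": columns, "data_files": sorted(active_files)}
```

If the location is mounted, you could call it with the mount path (something like /dbfs/mnt/your_mount/your_table, which is a placeholder, not a real path) and compare the returned columns with the columns you expect in the table. A FileNotFoundError here suggests the path points at a single file or a subfolder rather than the Delta folder itself.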
My blog: https://databrickster.medium.com/