Refreshing DELTA external table
10-03-2024 09:17 AM
I'm having trouble with the REFRESH TABLE command - does it work with DELTA external tables? I'm doing the following steps:
- Create table:
CREATE TABLE IF NOT EXISTS `catalog`.`default`.`table_name` (
KEY DOUBLE
, CUSTKEY DOUBLE
, STATUS STRING
, PRICE DOUBLE
, DATE TIMESTAMP
, PRIORITY STRING
)
USING PARQUET LOCATION 's3://bucket-name/folder-name/';
- Convert to Delta: CONVERT TO DELTA `catalog`.`default`.`table_name`;
- Add new parquet file to s3 folder
- I tried REFRESH TABLE, then CONVERT TO DELTA again, which didn't work. I also tried converting to delta first and then REFRESH TABLE, and I can't get the new file recognized.
I can't get the new file to show up in the created delta table without dropping and recreating the external table - is REFRESH TABLE supposed to work for DELTA external tables? Is there another order of operations I need to do to get the new file to be recognized in the existing delta external table?
10-07-2024 05:52 AM
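Step 3 is the problem: insert the data into the table; don't add the file directly to the S3 folder.
Once the table is converted to Delta, it maintains a transaction log, and only files recorded in that log are part of the table. Copying a Parquet file into the folder (followed by another CONVERT TO DELTA or REFRESH TABLE) won't work, because the rest of the dataset is already Delta and the new file never gets registered in the log.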
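As a rough sketch of what "insert the data" could look like: the statements below assume the new Parquet file has been uploaded to a separate staging prefix rather than into the table's own location. The staging path and file name are hypothetical and not from the original post. Either option should be enough on its own; running both would load the same rows twice.

```sql
-- Assumption: the new file sits under a hypothetical staging prefix,
-- s3://bucket-name/staging-folder/, not under the table's own location.

-- Option 1: append the staged file with a plain INSERT.
-- SELECT * maps columns by position, so the file's column order
-- must match the table definition.
INSERT INTO `catalog`.`default`.`table_name`
SELECT * FROM parquet.`s3://bucket-name/staging-folder/new-file.parquet`;

-- Option 2: load every Parquet file under the staging prefix with COPY INTO.
COPY INTO `catalog`.`default`.`table_name`
FROM 's3://bucket-name/staging-folder/'
FILEFORMAT = PARQUET;
```

Both statements go through the Delta transaction log, so the new rows show up without dropping and recreating the table. COPY INTO also skips files it has already loaded, which may make it the more convenient choice if files keep arriving in the staging prefix.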

