Hi Databricks community,
Hope you are doing well.
I am trying to create an external table using a Gzipped CSV file uploaded to an S3 bucket.
The S3 URI of the object has no file extension, but its content is a Gzipped, comma-separated file that I want to read into the external table.
The command I'm using is:
CREATE EXTERNAL TABLE `mycatalog`.`myExternalTable` (
  `ID` STRING,
  `value` STRING
)
USING CSV
OPTIONS (
  PATH 's3://mybucket/filename',
  HEADER 'false',
  encoding 'UTF-8',
  compression 'gzip',
  delimiter ','
);
If I create the table from that exact same file, in the same bucket, but with the .gz extension, it works.
Without the extension, SELECT * on the table returns jumbled output, which suggests the file is not being decompressed.
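To illustrate the symptom outside Databricks, here is a minimal Python sketch. It uses made-up sample data rather than my actual file, and the helper names looks_gzipped and read_csv_bytes are hypothetical. It assumes the object really is gzip, which can be checked from its first two bytes (0x1f 0x8b) regardless of the file name. Decoding the compressed bytes directly gives the same kind of garbage I see in the table, while decompressing first gives the expected rows.

```python
import gzip

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(data: bytes) -> bool:
    """Return True if the bytes start with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def read_csv_bytes(data: bytes) -> str:
    """Decode CSV text, decompressing first when the content is gzip."""
    if looks_gzipped(data):
        data = gzip.decompress(data)
    return data.decode("utf-8")


# Hypothetical sample standing in for the S3 object (no extension involved).
sample = gzip.compress(b"1,foo\n2,bar\n")

# Reading the compressed bytes as text yields jumbled output.
raw_view = sample.decode("utf-8", errors="replace")

print(looks_gzipped(sample))   # content-based check, independent of the name
print(read_csv_bytes(sample))  # the two CSV rows after decompression
```

The point of the sketch is only that the content can be identified as gzip without relying on the name, which is the behaviour I was hoping to get from the table definition.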
Is there a way to create the table without adding any extensions to the S3 file path?
Thanks for your time,
Aditya