Data Engineering
How to resolve the column name in s3 path saved as UUID format

sher
Valued Contributor II

Our managed Databricks tables are stored in S3 by default. When I read that S3 path directly, the column names appear as UUIDs.

E.g.: a column named ID in the Databricks table
shows up in the S3 path as COL- b400af61-9tha-4565-89c4-d6ba43f948b7

How can I resolve this? I am reading the S3 path directly to create an external table in Snowflake.

Thanks

2 REPLIES

Kaniz
Community Manager

Hi @sher, when creating an external table in Snowflake that points to a directory in Amazon S3, you'll need to follow a specific syntax and consider the file format.

 

Let’s address your issue with the UUID column name.

 

File Format:

  • First, create a file format that defines the type of file (e.g., CSV), the field delimiter (e.g., comma), whether data is enclosed in double quotes, and whether to skip the header.
  • For example:

    CREATE OR REPLACE FILE FORMAT my_schema.my_format
      TYPE = 'CSV'
      FIELD_DELIMITER = ','
      FIELD_OPTIONALLY_ENCLOSED_BY = '"'
      SKIP_HEADER = 1;

Stage Creation:

  • Next, create an external stage that specifies the S3 details and the file format.
  • Replace <path where file is kept> with the actual S3 path:

    CREATE OR REPLACE STAGE my_schema.my_stage
      URL = 's3://<path where file is kept>'
      CREDENTIALS = (AWS_KEY_ID = '****' AWS_SECRET_KEY = '****')
      FILE_FORMAT = (FORMAT_NAME = 'my_schema.my_format');

External Table Creation:

  • Finally, create the external table based on the stage name and file format.
  • Define the columns using expressions that extract the actual values from the VALUE variant column.
  • For example (note that the physical column name must be quoted, since it contains characters that are not valid in an unquoted identifier):

    CREATE OR REPLACE EXTERNAL TABLE my_schema.my_table (
      ID VARCHAR AS (VALUE:"COL- b400af61-9tha-4565-89c4-d6ba43f948b7"::VARCHAR),
      OtherColumn1 VARCHAR AS (VALUE:OtherColumn1::VARCHAR),
      OtherColumn2 INT AS (VALUE:OtherColumn2::INT)
    )
    WITH LOCATION = @my_schema.my_stage
    FILE_FORMAT = (FORMAT_NAME = 'my_schema.my_format');

  • Adjust the column names and data types for your specific use case.
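Because the physical column names are opaque UUIDs, it can be easier to generate the external-table DDL programmatically once you know the logical-to-physical mapping. The sketch below is a hypothetical helper, not part of Snowflake or Databricks; the table, stage, format, and column names are illustrative assumptions.

```python
# Hypothetical sketch: build Snowflake external-table DDL from a
# logical-name -> physical-name mapping. All object names are examples.
def build_external_table_ddl(table, stage, fmt, columns):
    """columns: list of (logical_name, sql_type, physical_name) tuples."""
    exprs = [
        # Quote the physical name: UUID names contain characters that
        # are invalid in an unquoted Snowflake identifier.
        f'  {logical} {sql_type} AS (VALUE:"{physical}"::{sql_type})'
        for logical, sql_type, physical in columns
    ]
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE {table} (\n"
        + ",\n".join(exprs)
        + f"\n)\nWITH LOCATION = @{stage}\n"
        + f"FILE_FORMAT = (FORMAT_NAME = '{fmt}');"
    )

ddl = build_external_table_ddl(
    "my_schema.my_table",
    "my_schema.my_stage",
    "my_schema.my_format",
    [("ID", "VARCHAR", "COL- b400af61-9tha-4565-89c4-d6ba43f948b7")],
)
print(ddl)
```

This keeps the UUID strings out of hand-written SQL: regenerate the DDL whenever the table schema changes.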

This approach should help you resolve the issue with UUID column names when reading directly from the S3 path to generate an external table in Snowflake. 🌟

sher
Valued Contributor II

Hi @Kaniz 

Thank you for your reply, but the issue is that I am not able to map ID to COL- b400af61-9tha-4565-89c4-d6ba43f948b7. I use

DESCRIBE TABLE EXTENDED table_name

to get the list of UUID column names, and I fetch the real column names from the information schema. Is there any other way to match them with the appropriate column names?
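The UUID names in S3 come from Delta Lake column mapping, and the logical-to-physical pairing is recorded in the table's transaction log: the metaData action's schemaString annotates each field with delta.columnMapping.physicalName. A minimal Python sketch of extracting that mapping follows; the sample commit line is a hypothetical, simplified reconstruction of a _delta_log JSON file, not output from a real table.

```python
import json

# Sketch: recover the logical -> physical column-name mapping from a
# Delta transaction-log commit file (e.g. _delta_log/00000000000000000000.json).
# Each line of such a file is one JSON action; the metaData action carries
# schemaString, whose fields hold delta.columnMapping.physicalName metadata
# when column mapping is enabled.
def column_mapping(log_lines):
    for line in log_lines:
        action = json.loads(line)
        meta = action.get("metaData")
        if meta is None:
            continue
        schema = json.loads(meta["schemaString"])
        return {
            field["name"]: field["metadata"].get(
                "delta.columnMapping.physicalName", field["name"]
            )
            for field in schema["fields"]
        }
    return {}

# Hypothetical commit line shaped like a Delta metaData action:
sample = json.dumps({
    "metaData": {
        "schemaString": json.dumps({
            "type": "struct",
            "fields": [{
                "name": "ID",
                "type": "string",
                "nullable": True,
                "metadata": {
                    "delta.columnMapping.id": 1,
                    "delta.columnMapping.physicalName":
                        "col-b400af61-9tha-4565-89c4-d6ba43f948b7",
                },
            }],
        }),
    }
})
print(column_mapping([sample]))
```

In practice you would read the latest commit file (or checkpoint) from the table's _delta_log directory in S3 and use the resulting dictionary to rename columns or generate the Snowflake DDL.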
