Data Engineering

How to resolve the column name in s3 path saved as UUID format

sher
Valued Contributor II

Our managed Databricks tables are stored in S3 by default, and when I read that S3 path directly, the column names come back as UUIDs.

For example, a column named ID in the Databricks table appears in the S3 path as COL- b400af61-9tha-4565-89c4-d6ba43f948b7.

How can I resolve this? I am reading the S3 path directly to create an external table in Snowflake.
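For reference, this renaming comes from Delta column mapping. A quick way to confirm it is enabled is to check the table property below (a minimal PySpark sketch, using a hypothetical table name):

    # Hedged sketch: if delta.columnMapping.mode is 'name' (or 'id'), the
    # physical Parquet columns are stored under col-<uuid> names rather than
    # the logical column names.
    spark.sql("SHOW TBLPROPERTIES my_schema.my_table").where(
        "key = 'delta.columnMapping.mode'"
    ).show(truncate=False)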

Thanks

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @sher, when creating an external table in Snowflake that points to a directory in Amazon S3, you'll need to follow a specific syntax and consider the file format.

Let's address your issue with the UUID column name.

File Format:

  • First, create a file format that defines the file type (e.g., CSV), the field delimiter (e.g., comma), whether data is optionally enclosed in double quotes, and whether to skip the header.
  • For example:

    CREATE OR REPLACE FILE FORMAT my_schema.my_format
      TYPE = 'CSV'
      FIELD_DELIMITER = ','
      FIELD_OPTIONALLY_ENCLOSED_BY = '"'
      SKIP_HEADER = 1;

Stage Creation:

  • Next, create an external stage that specifies the S3 location, the credentials, and the file format.
  • Replace <path where file is kept> with the actual S3 path:

    CREATE OR REPLACE STAGE my_schema.my_stage
      URL = 's3://<path where file is kept>'
      CREDENTIALS = (AWS_KEY_ID = '****' AWS_SECRET_KEY = '****')
      FILE_FORMAT = (FORMAT_NAME = 'my_schema.my_format');

External Table Creation:

  • Finally, create the external table based on the stage and the file format.
  • Define the columns using expressions that extract the actual values from the staged data. Note that the UUID column name must be double-quoted inside the VALUE path, since it contains hyphens.
  • For example:

    CREATE OR REPLACE EXTERNAL TABLE my_schema.my_table (
      ID VARCHAR AS (VALUE:"COL- b400af61-9tha-4565-89c4-d6ba43f948b7"::VARCHAR),
      OtherColumn1 VARCHAR AS (VALUE:OtherColumn1::VARCHAR),
      OtherColumn2 INT AS (VALUE:OtherColumn2::INT)
    )
    WITH LOCATION = @my_schema.my_stage
    FILE_FORMAT = (FORMAT_NAME = 'my_schema.my_format');

  • Adjust the column names and data types for your specific use case (see the verification sketch after this list).
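One caveat, hedged: managed Databricks tables are Delta tables whose underlying data files are Parquet, so a PARQUET file format (rather than the CSV example above) is likely what the stage needs, and the UUID names are the physical Parquet column names. A quick way to confirm exactly what Snowflake will see is to read the data files directly with Spark (a sketch with a placeholder path; note it bypasses the Delta log, so it also picks up files from old table versions):

    # Hedged sketch: reading the Parquet data files directly (ignoring the
    # Delta transaction log) prints the physical col-<uuid> column names
    # that Snowflake's external table will see.
    spark.read.parquet("s3://<bucket>/<table-path>/").printSchema()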

This approach should help you resolve the issue with UUID column names when reading directly from the S3 path to generate an external table in Snowflake.

sher
Valued Contributor II

Hi @Kaniz_Fatma,

Thank you for your reply, but the issue is that I am not able to map ID to COL- b400af61-9tha-4565-89c4-d6ba43f948b7. I used the query

DESCRIBE TABLE EXTENDED table_name

to get the list of UUID column names, and I am fetching the real column names from the information schema. Is there any other way to match them with the appropriate column names?
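One way to build that mapping programmatically, hedged: when delta.columnMapping.mode = 'name', the Delta transaction log stores each column's physical name in the schema metadata under the key delta.columnMapping.physicalName. Below is a minimal PySpark sketch (it assumes a Databricks notebook where spark is predefined, uses a placeholder path, and naively takes the latest metaData entry in the log rather than reading a checkpoint):

    import json

    from pyspark.sql.functions import input_file_name

    # Read every commit in the transaction log and keep the entries that carry
    # table metadata; the schema JSON lives in metaData.schemaString.
    log = (spark.read.json("s3://<bucket>/<table-path>/_delta_log/*.json")
           .withColumn("file", input_file_name())
           .where("metaData IS NOT NULL")
           .orderBy("file"))

    latest_schema = log.select("metaData.schemaString").collect()[-1][0]

    # Map each logical column name to its physical (UUID) name.
    for field in json.loads(latest_schema)["fields"]:
        physical = field.get("metadata", {}).get(
            "delta.columnMapping.physicalName", field["name"])
        print(f'{field["name"]} -> {physical}')

The printed pairs can then be dropped into the Snowflake DDL: the logical name goes on the left of AS, and the quoted physical name goes inside the VALUE path.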
