cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

CHAR/VARCHAR fields sometimes show as STRING in a view

kmenke-em
New Contributor II

We've found an interesting behavior where `char` and `varchar` fields in a table show as the `string` type in a view. Consider the following table and view:

create or replace table thirty_day_tables.kit_varchar_string (
  str1 string,
  str2 char(10),
  str3 varchar(10)
);

create or replace view thirty_day_tables.kit_varchar_string_vw as 
select str1, str2, str3, cast("dummy" as varchar(10)) as str4 
from thirty_day_tables.kit_varchar_string;

In this case, it mostly works as I expected except for the manufactured field `str4`. The output for `

describe thirty_day_tables.kit_varchar_string_vw`:
col_namedata_typecomment
str1stringnull
str2char(10)null
str3varchar(10)null
str4stringnull

Why does str4 show as a string despite the explicit cast to a varchar(10)?

Even stranger, if I add a comment to each column I get a different result:

create or replace view thirty_day_tables.kit_varchar_string_vw (
  str1 comment "OK - expected to be a string in the view",
  str2 comment "ERROR - expected to be a char(10) in the view",
  str3 comment "ERROR - expected to be a varchar(10) in the view",
  str4 comment "ERROR - expected to be a varchar(10) in the view"
 ) as 
select str1, str2, str3, cast("dummy" as varchar(10)) as str4 
from thirty_day_tables.kit_varchar_string;

The output for `describe thirty_day_tables.kit_varchar_string_vw`:

col_namedata_typecomment
str1stringOK - expected to be a string in the view
str2stringERROR - expected to be a char(10) in the view
str3stringERROR - expected to be a varchar(10) in the view
str4stringERROR - expected to be a varchar(10) in the view

Why do all the columns show as string???

 

1 ACCEPTED SOLUTION

Accepted Solutions

UmaMahesh1
Honored Contributor III

In Spark SQL

  • string is the canonical type for all textual data.
  • char(n) and varchar(n) are parsed and stored as metadata, but internally treated as string.
  • When you create a view, Spark does not preserve the original char(n) or varchar(n) types — it normalizes everything to string.

This is why:

  • In your first view (without comments), Spark is able to preserve the metadata from the underlying table for str2 and str3.
  • But when you add column comments, Spark reconstructs the schema from the SELECT clause and drops the original metadata, defaulting all string-like types to string.
Uma Mahesh D

View solution in original post

1 REPLY 1

UmaMahesh1
Honored Contributor III

In Spark SQL

  • string is the canonical type for all textual data.
  • char(n) and varchar(n) are parsed and stored as metadata, but internally treated as string.
  • When you create a view, Spark does not preserve the original char(n) or varchar(n) types — it normalizes everything to string.

This is why:

  • In your first view (without comments), Spark is able to preserve the metadata from the underlying table for str2 and str3.
  • But when you add column comments, Spark reconstructs the schema from the SELECT clause and drops the original metadata, defaulting all string-like types to string.
Uma Mahesh D

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now