cancel
Showing results for 
Search instead for 
Did you mean: 
Data Governance
Join discussions on data governance practices, compliance, and security within the Databricks Community. Exchange strategies and insights to ensure data integrity and regulatory compliance.
cancel
Showing results for 
Search instead for 
Did you mean: 

Cannot create a table having a column whose name contains commas in Hive metastore.

Shafi
New Contributor III

Hi

Tried to create a delta table from spark data frame using below command:

destination_path = "/dbfs/mnt/kidneycaredevstore/delta/df_corr_feats_spark_4"

df_corr_feats_spark.write.format("delta").option("delta.columnMapping.mode", "name").option("path",destination_path).saveAsTable("CKD_Features_4")

Getting below error:

AnalysisException: Cannot create a table having a column whose name contains commas in Hive metastore. Table: `default`.`abc_features_4`; Column: Adverse, abc initial encounter

Please note that there are around 6k columns in this data frame and it is developed by data scientist generate feature. So, we cannot rename columns.

How to fix this error. Any help will be appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions

Pat
Honored Contributor III

Hi @Shafiul Alam​ ,

yeah it was what I would do old days. Rename the column, I used this as an example: re.sub(r'[^0-9a-zA-Z]+', "_", col)

The issue is here that hive_metastore doesn't allow names with commas you are right. The documentation must be related to the Databricks implementation of metastore - it's confusing in the documentation sometimes

It was fine for tables in Unity Catalog:

image 

but for hive_metastore it throws an error:

image

View solution in original post

4 REPLIES 4

Pat
Honored Contributor III

Hi @Shafiul Alam​ ,

who gave those names to columns? 🙂

you can rename you columns, replace spaces / special characters, for example:

%python
import re
list_of_columns = df_corr_feats_spark.colums
renamed_list_of_columns = [ re.sub(r'[^0-9a-zA-Z]+', "_", col) for col in list_of_columns]
df_corr_feats_spark.toDF(*new_column_name_list)

thanks,

Pat

Shafi
New Contributor III

@Pat Sienkiewicz​ , Thanks for responding.

So, does this mean that the delta table column cannot contain any non-ascii characters? I thought option("delta.columnMapping.mode", "name") handles columns with non-ascii characters which is a feature from DBR > 10.2. But, looks like Metastore is not supporting such column naming.

Thanks again for your help.

Pat
Honored Contributor III

Hi @Shafiul Alam​ ,

yeah it was what I would do old days. Rename the column, I used this as an example: re.sub(r'[^0-9a-zA-Z]+', "_", col)

The issue is here that hive_metastore doesn't allow names with commas you are right. The documentation must be related to the Databricks implementation of metastore - it's confusing in the documentation sometimes

It was fine for tables in Unity Catalog:

image 

but for hive_metastore it throws an error:

image

Shafi
New Contributor III

@Pat Sienkiewicz​ , thanks a lot for sharing this suggestion

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group