11-17-2022 11:39 PM
Hi
I tried to create a Delta table from a Spark DataFrame using the command below:
destination_path = "/dbfs/mnt/kidneycaredevstore/delta/df_corr_feats_spark_4"
df_corr_feats_spark.write.format("delta").option("delta.columnMapping.mode", "name").option("path",destination_path).saveAsTable("CKD_Features_4")
I get the following error:
AnalysisException: Cannot create a table having a column whose name contains commas in Hive metastore. Table: `default`.`abc_features_4`; Column: Adverse, abc initial encounter
Please note that there are around 6k columns in this DataFrame, and they were produced by a data scientist's feature generation, so we cannot rename the columns.
How can I fix this error? Any help will be appreciated.
11-18-2022 05:07 AM
Hi @Shafiul Alam ,
who gave those names to columns? 🙂
You can rename your columns, replacing spaces and special characters, for example:
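%python
import re

# Replace every run of characters outside [0-9a-zA-Z] with a single underscore
list_of_columns = df_corr_feats_spark.columns
renamed_list_of_columns = [re.sub(r'[^0-9a-zA-Z]+', "_", col) for col in list_of_columns]

# toDF returns a new DataFrame, so keep the result
df_renamed = df_corr_feats_spark.toDF(*renamed_list_of_columns)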
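With around 6k generated feature names, two different originals can collapse to the same cleaned name once the special characters are replaced. A rough sketch of one way to guard against that, and to keep a lookup back to the original names, is below. The function name sanitize_column_names, the numeric suffix scheme, and the "col" fallback are my own assumptions, not anything Spark or Delta provides, and unlike the one-liner above it also strips leading and trailing underscores.

```python
import re


def sanitize_column_names(columns):
    """Map cleaned, unique column names to their original names.

    Hypothetical helper: the suffix scheme and the "col" fallback
    are illustrative choices, not a Spark or Delta convention.
    """
    used = set()   # lower-cased names already taken
    mapping = {}   # cleaned name -> original name, in column order
    for original in columns:
        # Same regex as above; also trim stray underscores at the ends.
        base = re.sub(r"[^0-9a-zA-Z]+", "_", original).strip("_") or "col"
        candidate, suffix = base, 1
        # Compare case-insensitively, since Spark resolves column
        # names case-insensitively by default.
        while candidate.lower() in used:
            suffix += 1
            candidate = f"{base}_{suffix}"
        used.add(candidate.lower())
        mapping[candidate] = original
    return mapping


cols = ["Adverse, abc initial encounter", "Adverse abc initial encounter", "age"]
mapping = sanitize_column_names(cols)
new_names = list(mapping)
```

The new names can then be applied with df_corr_feats_spark.toDF(*new_names), and mapping can be stored alongside the table so the data scientists can still look up the original feature names.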
thanks,
Pat
11-18-2022 11:29 PM
@Pat Sienkiewicz , Thanks for responding.
So, does this mean that a Delta table column cannot contain any non-ASCII characters? I thought option("delta.columnMapping.mode", "name"), a feature available from DBR > 10.2, handles columns with non-ASCII characters. But it looks like the metastore does not support such column names.
Thanks again for your help.
11-19-2022 12:24 AM
Hi @Shafiul Alam ,
Yes, renaming the columns is what I would have done in the old days. The example I used was: re.sub(r'[^0-9a-zA-Z]+', "_", col)
You are right, the issue here is that hive_metastore doesn't allow column names with commas. The documentation must be referring to the Databricks implementation of the metastore; the documentation is sometimes confusing on this point.
It was fine for tables in Unity Catalog, but for hive_metastore it throws an error.
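Since the error message above complains specifically about commas, a quick way to see how many of the columns are affected before writing is to list them. This is only a sketch: the helper name columns_with_commas is made up, and it assumes the comma check is the only rule the write is tripping over, which may not hold for every metastore setup.

```python
def columns_with_commas(columns):
    """Return the column names that contain a comma (hypothetical helper)."""
    return [name for name in columns if "," in name]


# Example with plain strings; on a real DataFrame, pass df.columns instead.
example_columns = ["Adverse, abc initial encounter", "age", "bmi"]
offending = columns_with_commas(example_columns)
print(len(offending), "column(s) contain commas:", offending)
```

If only a few columns show up, renaming just those (and leaving the rest untouched) may be easier to agree on than cleaning all of the names.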
11-21-2022 10:23 AM
@Pat Sienkiewicz , thanks a lot for sharing this suggestion