11-17-2022 11:39 PM
Hi,
I tried to create a Delta table from a Spark DataFrame using the command below:
destination_path = "/dbfs/mnt/kidneycaredevstore/delta/df_corr_feats_spark_4"
df_corr_feats_spark.write.format("delta").option("delta.columnMapping.mode", "name").option("path",destination_path).saveAsTable("CKD_Features_4")
I get the following error:
AnalysisException: Cannot create a table having a column whose name contains commas in Hive metastore. Table: `default`.`abc_features_4`; Column: Adverse, abc initial encounter
Please note that this DataFrame has around 6k columns, and it was produced by a data scientist's feature-generation code, so we cannot rename the columns.
How can I fix this error? Any help will be appreciated.
11-18-2022 05:07 AM
Hi @Shafiul Alam,
who gave the columns those names?
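You can rename your columns, replacing spaces and special characters, for example:
%python
import re
# Collapse every run of non-alphanumeric characters into a single underscore
list_of_columns = df_corr_feats_spark.columns
renamed_list_of_columns = [re.sub(r'[^0-9a-zA-Z]+', "_", col) for col in list_of_columns]
# toDF returns a new DataFrame, so keep the result
df_corr_feats_spark = df_corr_feats_spark.toDF(*renamed_list_of_columns)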
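With around 6k generated column names, two different names can collapse to the same sanitized name (for example "a b" and "a,b" both become "a_b"). Here is a minimal sketch of one way to guard against that while keeping a record of the original names. The helper name `sanitize_columns` and the numeric-suffix scheme are illustrative assumptions, not something from this thread, and names are compared case-insensitively on the assumption that the metastore treats column names that way.

```python
import re

def sanitize_columns(columns):
    """Return a list of (original, sanitized) pairs with unique sanitized names.

    Hypothetical helper: uses the same regex as the post above and appends
    _1, _2, ... when a sanitized name is already taken (ignoring case).
    """
    used = set()
    pairs = []
    for original in columns:
        base = re.sub(r"[^0-9a-zA-Z]+", "_", original)
        candidate = base
        suffix = 1
        # Keep trying suffixes until the name is free
        while candidate.lower() in used:
            candidate = f"{base}_{suffix}"
            suffix += 1
        used.add(candidate.lower())
        pairs.append((original, candidate))
    return pairs

# Possible usage on the DataFrame from the question (not executed here):
# pairs = sanitize_columns(df_corr_feats_spark.columns)
# renamed_df = df_corr_feats_spark.toDF(*[new for _, new in pairs])
```

The returned pairs could be stored next to the table so the data scientists can still look up each feature by its original name.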
thanks,
Pat
11-18-2022 11:29 PM
@Pat Sienkiewicz, thanks for responding.
So, does this mean that a Delta table column cannot contain any non-ASCII characters? I thought option("delta.columnMapping.mode", "name") handles columns with non-ASCII characters, which is a feature in DBR > 10.2. But it looks like the metastore does not support such column names.
Thanks again for your help.
11-19-2022 12:24 AM
Hi @Shafiul Alam,
Yeah, that is what I would have done in the old days: rename the columns. I used this as an example: re.sub(r'[^0-9a-zA-Z]+', "_", col)
You are right, the issue here is that hive_metastore doesn't allow column names with commas. The documentation must be describing the Databricks implementation of the metastore; the documentation is sometimes confusing on this point.
It was fine for tables in Unity Catalog, but for hive_metastore it throws an error.
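If most of the 6k names have to stay untouched, one option is to rename only the columns that actually contain a comma. The sketch below assumes that the comma is the only character hive_metastore rejects, which is what the error message suggests but is not confirmed in this thread. The helper name `rename_comma_columns` and the default replacement character are hypothetical.

```python
def rename_comma_columns(columns, replacement="_"):
    """Map only the comma-containing names to comma-free names.

    Hypothetical helper: every other column name is left out of the
    mapping and therefore stays exactly as it was.
    """
    return {c: c.replace(",", replacement) for c in columns if "," in c}

# Possible usage on the DataFrame from the question (not executed here):
# renames = rename_comma_columns(df_corr_feats_spark.columns)
# for old, new in renames.items():
#     df_corr_feats_spark = df_corr_feats_spark.withColumnRenamed(old, new)
```

This keeps the change set small, but the write can still fail if the metastore rejects other characters as well.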
11-21-2022 10:23 AM
@Pat Sienkiewicz, thanks a lot for sharing this suggestion.