@Lakshmi Jayaraman:
It's possible that the issue is related to the encoding used when the Delta table is read by the Python script. One solution is to handle the encoding explicitly when inspecting the table.
You can try reading the table with the delta package in Python and checking the column-name encoding as follows:
from delta.tables import DeltaTable
deltaTable = DeltaTable.forPath(spark, "/path/to/table")
df = deltaTable.toDF()
# Print the raw UTF-8 bytes of each column name so any hidden characters are visible
for col in df.columns:
    print(col.encode('utf-8'))
This should display the column names as UTF-8 byte strings. You can then use these names to reference the columns in the DataFrame.
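If the printed bytes reveal hidden characters (for example a BOM or a non-breaking space), a minimal sketch for spotting them and referencing such a column without retyping its name is shown below; the non-ASCII check and the use of the first column are illustrative assumptions, not something taken from your table:
# Flag column names whose raw bytes contain anything outside plain ASCII
for col in df.columns:
    raw = col.encode('utf-8')
    if any(b > 127 for b in raw):
        print("Non-ASCII characters in column:", raw)
# Reference the column through df.columns so the name always matches exactly
first_col = df.columns[0]
df.select(df[first_col]).show()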
If this doesn't work, you can try reading the Delta table through the Spark DataFrame reader in your Python script as follows:
df = spark.read.format("delta").load("/path/to/table")
# Print the raw UTF-8 bytes of each column name so any hidden characters are visible
for col in df.columns:
    print(col.encode('utf-8'))
This should also display the column names as UTF-8 byte strings.
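If you want to go through Spark SQL proper, a minimal sketch that queries the Delta path directly (same placeholder path as above) is:
# Query the Delta table by path using Spark SQL's delta.`<path>` syntax
df_sql = spark.sql("SELECT * FROM delta.`/path/to/table`")
for col in df_sql.columns:
    print(col.encode('utf-8'))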
If neither of these solutions works, it's possible that the issue is related to the version of Delta Lake used by the Python script. Make sure you are using a version of Delta Lake that is compatible with the Delta table properties you have set. You can check the version of Delta Lake used in Databricks by running the following command in a notebook cell:
%sh
cat /databricks/spark/python/lib/python3.7/site-packages/delta/VERSION
Make sure that the version of Delta Lake used in your Python script matches the version used in Databricks.
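As a cross-check from the Python side, a minimal sketch that prints the installed package version (assuming Python 3.8+ and that the bindings were installed from the delta-spark PyPI package) is:
# Print the version of the installed delta-spark package
from importlib.metadata import version
print(version("delta-spark"))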