Comparing schemas of two dataframes


I was comparing the schemas of two different DataFrames using this code:


>>> df1.schema == df2.schema
Out: False


But the thing is, the two schemas look completely identical.

When digging deeper, I realized that some of the StructFields that should have been equal have different metadata properties:


{'name': 'customer_id', 'dataType': StringType(), 'nullable': True, 'metadata': {}}
{'name': 'customer_id', 'dataType': StringType(), 'nullable': True, 'metadata': {'scale': 0}}


What does this metadata property do?



>>> all(str(x) == str(y) for x, y in zip(df1.schema, df2.schema))
Out: True




Hi @dream ,

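In this case, you can compare df1.dtypes with df2.dtypes instead. dtypes returns a list of (column name, type string) pairs, so it ignores both nullability and metadata.
Metadata stores extra information about a column as a dictionary on its StructField. It takes part in the schema equality check, which is why df1.schema == df2.schema returns False here.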
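
If you want to keep checking nullability and only ignore metadata, one option is to compare the JSON form of the schemas with the metadata entries removed. This is a minimal sketch, not an official API: drop_metadata is a hypothetical helper, and the two dictionaries are hand-written stand-ins shaped like the output of df1.schema.jsonValue() and df2.schema.jsonValue(), so the example runs without Spark.

```python
def drop_metadata(node):
    """Return a copy of a schema JSON value with every 'metadata' entry removed."""
    if isinstance(node, dict):
        # Recurse into nested structs, arrays and maps; skip the metadata key.
        return {k: drop_metadata(v) for k, v in node.items() if k != "metadata"}
    if isinstance(node, list):
        return [drop_metadata(v) for v in node]
    return node


# Assumption: stand-ins for df1.schema.jsonValue() and df2.schema.jsonValue().
schema1 = {"type": "struct", "fields": [
    {"name": "customer_id", "type": "string", "nullable": True, "metadata": {}},
]}
schema2 = {"type": "struct", "fields": [
    {"name": "customer_id", "type": "string", "nullable": True,
     "metadata": {"scale": 0}},
]}

same_as_is = schema1 == schema2
same_without_metadata = drop_metadata(schema1) == drop_metadata(schema2)
print(same_as_is, same_without_metadata)  # prints: False True
```

With real DataFrames, the same check would be drop_metadata(df1.schema.jsonValue()) == drop_metadata(df2.schema.jsonValue()). Unlike the dtypes comparison, this still reports a difference when only the nullable flag differs.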

