Is the delta schema enforcement flexible?

StephanieRivera
Valued Contributor II

In the sense that: is it possible to check only column names, or only column data types, or will it always check both?

1 REPLY

StephanieRivera
Valued Contributor II

No, I do not believe that is possible. However, I would be interested in understanding a use case where that is ideal behavior.

How Does Schema Enforcement Work?

Delta Lake uses schema validation on write, which means that all new writes to a table are checked for compatibility with the target table’s schema at write time. If the schema is not compatible, Delta Lake cancels the transaction altogether (no data is written), and raises an exception to let the user know about the mismatch.

To determine whether a write to a table is compatible, Delta Lake uses the following rules. The DataFrame to be written:

  • Cannot contain any additional columns that are not present in the target table’s schema. Conversely, it’s OK if the incoming data doesn’t contain every column in the table – those columns will simply be assigned null values.
  • Cannot have column data types that differ from the column data types in the target table. If a target table’s column contains StringType data, but the corresponding column in the DataFrame contains IntegerType data, schema enforcement will raise an exception and prevent the write operation from taking place.
  • Cannot contain column names that differ only by case. This means that you cannot have columns such as ‘Foo’ and ‘foo’ defined in the same table. While Spark can be used in case-sensitive or case-insensitive (default) mode, Delta Lake is case-preserving but case-insensitive when storing the schema. Parquet is case-sensitive when storing and returning column information. To avoid potential mistakes, data corruption, or loss issues (which we’ve personally experienced at Databricks), we decided to add this restriction.
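To make the three rules concrete, here is a minimal Python sketch of the compatibility check, not Delta Lake's actual implementation. The `is_write_compatible` helper and its dict-based schema representation are hypothetical, introduced only to illustrate the rules above:

```python
# Sketch of Delta Lake's write-time compatibility rules, assuming schemas
# are represented as simple dicts of column name -> type name.
# This is an illustration, not the real Delta Lake code path.

def is_write_compatible(table_schema, incoming_schema):
    """Return (ok, reason) for writing incoming_schema into table_schema."""
    # Rule 3: incoming column names may not differ only by case,
    # because Delta Lake stores the schema case-insensitively.
    lowered = [name.lower() for name in incoming_schema]
    if len(set(lowered)) != len(lowered):
        return False, "incoming columns differ only by case"

    table_lower = {name.lower(): t for name, t in table_schema.items()}
    for name, dtype in incoming_schema.items():
        # Rule 1: no columns that are absent from the target table's schema.
        if name.lower() not in table_lower:
            return False, f"column {name!r} not in target schema"
        # Rule 2: data types must match the target column exactly.
        if table_lower[name.lower()] != dtype:
            return False, f"type mismatch on column {name!r}"

    # Missing columns are fine: Delta Lake fills them with nulls.
    return True, "compatible"
```

For example, against a table with `{"id": "IntegerType", "name": "StringType"}`, writing only `{"id": "IntegerType"}` passes (the missing `name` column becomes null), while `{"id": "StringType"}` fails rule 2 and `{"id": "IntegerType", "extra": "IntegerType"}` fails rule 1.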
