Data Engineering
How to enforce schema checks and benefit from badRecordsPath when using Auto Loader

Swann
New Contributor

We would like a robust reader that ensures the data we read and write with Auto Loader respects the schema provided to the Auto Loader reader.

We also set the option "badRecordsPath" (see https://docs.databricks.com/spark/latest/spark-sql/handling-bad-records.html), which works fine for corrupted files and similar cases.
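For context, here is a minimal sketch of the kind of reader we have in mind; the paths, the column names, and the cloudFiles.format value are placeholders, not our real configuration:

```python
# Minimal sketch of our Auto Loader reader (all paths, column names, and
# the source format are placeholders).
from pyspark.sql.types import StructType, StructField, StringType, LongType

expected_schema = StructType([
    StructField("id", LongType(), True),        # source sometimes delivers DECIMAL(20, 0) here
    StructField("payload", StringType(), True),
])

df = (
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "parquet")         # assumed source format
    .option("badRecordsPath", "/mnt/bad-records/")  # quarantine location for bad input
    .schema(expected_schema)                        # the schema we want enforced
    .load("/mnt/landing/")
)
```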

We have an issue similar to the one documented in https://kb.databricks.com/data/wrong-schema-in-files.html, where the DECIMAL(20, 0) found in the source files is incompatible with the LONG we specify in our schema.
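A hypothetical reproduction of that mismatch, with placeholder paths: write a Parquet file containing a DECIMAL(20, 0) column, then read it back declaring the column as LONG, as in the KB article above.

```python
# Write a Parquet file with a DECIMAL(20, 0) column (placeholder paths).
from decimal import Decimal
from pyspark.sql.types import StructType, StructField, DecimalType, LongType

spark.createDataFrame(
    [(Decimal(1),)],
    StructType([StructField("id", DecimalType(20, 0), True)]),
).write.mode("overwrite").parquet("/tmp/decimal-source/")

# Read it back declaring the column as LONG.
df = (
    spark.read
    .schema(StructType([StructField("id", LongType(), True)]))
    .option("badRecordsPath", "/tmp/bad-records/")
    .parquet("/tmp/decimal-source/")
)

# Instead of routing the file to /tmp/bad-records/, this raises a
# SchemaColumnConvertNotSupportedException at action time.
df.show()
```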

The main question is then: is there a way to make Spark log to the location given in badRecordsPath when the above happens, rather than raising an exception (from which we cannot recover the file paths causing the issue)? Since all of this is declarative, it depends heavily on the available options and on the implementation of "badRecordsPath".
