Re: Databricks SQL Error outputting sensitive data to logs

seanstachff · ‎02-11-2025

Hi - I am using `from_json` with FAILFAST to correctly format some data using Databricks SQL. However, this function can return the error "[MALFORMED_RECORD_IN_PARSING.WITHOUT_SUGGESTION] Malformed records are detected in record parsing", with the rest of the line being the data that caused the error.

Is there any way to prevent this from happening and is there anywhere else this can happen? The data I am working with is sensitive and I don't want it appearing in our logs.

NandiniN · ‎05-01-2025

Checking.

NandiniN · ‎05-01-2025

You could use PERMISSIVE mode instead of FAILFAST. The relevant `from_json` options are:

- mode (default PERMISSIVE): allows a mode for dealing with corrupt records during parsing.
  - PERMISSIVE: when it meets a corrupted record, puts the malformed string into a field configured by columnNameOfCorruptRecord, and sets malformed fields to null. To keep corrupt records, you can set a string type field named columnNameOfCorruptRecord in a user-defined schema. If a schema does not have the field, it drops corrupt records during parsing. When inferring a schema, it implicitly adds a columnNameOfCorruptRecord field in an output schema.
- columnNameOfCorruptRecord (default is the value specified in spark.sql.columnNameOfCorruptRecord): allows renaming the new field having malformed string created by PERMISSIVE mode. This overrides spark.sql.columnNameOfCorruptRecord.
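
As a rough sketch of what that might look like in Databricks SQL: the inline rows, the `payload` column name, and the `id INT, name STRING` schema below are made up for illustration and are not from the original question. The idea is that with PERMISSIVE mode, malformed input should come back with null fields rather than raising an error that echoes the record.

```sql
-- Illustrative only: inline sample rows stand in for the real table,
-- and the schema 'id INT, name STRING' is a hypothetical example.
SELECT
  from_json(
    payload,
    'id INT, name STRING',
    map('mode', 'PERMISSIVE')  -- instead of 'FAILFAST'
  ) AS parsed
FROM VALUES
  ('{"id": 1, "name": "a"}'),  -- well-formed record
  ('{not valid json')          -- malformed record
  AS t(payload);
```

Note that this schema deliberately leaves out a columnNameOfCorruptRecord field. Per the option description above, adding one would keep the raw malformed string in that column, which is probably not what you want for sensitive data. You can then filter on null fields in `parsed` to find the rows that failed to parse.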

Doc - https://docs.databricks.com/aws/en/sql/language-manual/functions/from_json
