Re: Issue with MongoDB Void Null Type in Databrick...

lingareddy_Alva · ‎06-16-2026

When a MongoDB field is null in every sampled document, Spark cannot infer a data type for it, so it assigns NullType (void). Delta Lake rejects such a column because it needs a concrete type for Parquet storage.

My suggestion is a fix in two layers.

Layer 1 — Explicit Schema on Read
Declare the schema yourself before reading from MongoDB. Spark then skips inference entirely and uses your declared types, so even if the values are null at runtime, the column type is concrete and Delta accepts it.

Layer 2 — Defensive Null Cast (Safety Net)
After reading, scan all columns and cast any surviving NullType to StringType.
This catches edge cases where the MongoDB connector overrides your declared schema internally for certain field patterns.

LR