Hi dear community,
When I worked in the Hadoop ecosystem with HDS, the landing zone was our raw layer. We serialized this raw data in Avro format (for its schema evolution feature), assigning names to the columns but not enforcing any data types: every column was stored as a string.
In the medallion architecture, bronze represents the raw layer, and I would like some clarity on best practices here. My question is: should we use Delta tables and store every column as a string, or should we enforce specific data types at this layer? Alternatively, should we let the Delta table handle data discovery for us?
Thank you very much in advance for any clarification on best practices, the alternatives, and their pros and cons.
Best regards