Hey everyone,
I recently built a modern financial data lakehouse using the open-source Spark Declarative Pipelines (SDP OSS), Apache Iceberg, and the AWS Glue Data Catalog.
The blog covers:
- Building declarative data pipelines with Spark
- Using Apache Iceberg as the table format
- Managing metadata with the AWS Glue Data Catalog
- Streaming and batch processing patterns
- End-to-end lakehouse architecture ideas
Blog link: https://medium.com/@pranavsadagopan/building-a-spark-declarative-pipeline-a-modern-financial-data-la...
Looking forward to your feedback and discussion 🙂