Data System & Architecture - PySpark Assignment

200649021 · 11-21-2025

Title: Spark Structured Streaming – Airport Counts by Country

This notebook demonstrates how to set up a Spark Structured Streaming job in Databricks Community Edition.
It reads new CSV files as they arrive in a Unity Catalog volume, counts airports per country, and writes the results to the console.
The code defines an explicit schema, sets a checkpoint location, and uses only triggers that Community Edition clusters support.
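
The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. The volume paths, the column names in the schema, the header option, and the choice of the `availableNow` trigger are all assumptions and should be adjusted to the real data and cluster. The schema is given as a DDL string, which `readStream.schema()` accepts in place of a `StructType`.

```python
# Hypothetical paths: replace the placeholders with a real catalog, schema, and volume.
INPUT_PATH = "/Volumes/<catalog>/<schema>/<volume>/airports/"
CHECKPOINT_PATH = "/Volumes/<catalog>/<schema>/<volume>/_checkpoints/airport_counts/"

# Explicit schema as a DDL string. The column names are assumed for illustration;
# only a `country` column is actually required by the aggregation below.
AIRPORT_SCHEMA = "airport_id INT, name STRING, city STRING, country STRING"


def build_airport_counts(spark, input_path=INPUT_PATH, schema=AIRPORT_SCHEMA):
    """Return a streaming DataFrame with one row per country and its airport count."""
    airports = (
        spark.readStream
        .format("csv")
        .option("header", "true")  # assumes each CSV file has a header row
        .schema(schema)            # streaming file sources need a schema up front
        .load(input_path)          # picks up new files placed in this directory
    )
    return airports.groupBy("country").count()


def start_console_query(counts, checkpoint_path=CHECKPOINT_PATH):
    """Start a query that prints the running counts to the console."""
    return (
        counts.writeStream
        .format("console")
        .outputMode("complete")                         # emit the full aggregate table each batch
        .option("checkpointLocation", checkpoint_path)  # lets a restart resume where it stopped
        .trigger(availableNow=True)                     # assumed trigger: process pending files, then stop
        .start()
    )


# In a Databricks notebook, `spark` is already defined, so the job could be run as:
# query = start_console_query(build_airport_counts(spark))
# query.awaitTermination()
```

The `complete` output mode is used here because the query is an aggregation without a watermark, so each batch reprints the full country table rather than only new rows.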