Hi Chris,
You can use Auto Loader; it's the most reliable way to pick up each new Parquet file as it lands in S3 and make those records immediately queryable in Databricks. It incrementally discovers files and writes them into a Delta table (or streaming table), so you avoid manual refreshes or partition repairs on your external Parquet table. Because it tracks which files have already been processed, it never re-ingests an old file and only picks up new arrivals.
Ref Doc - https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader
Auto Loader (“cloudFiles”) continuously detects new objects in S3 and ingests them with checkpointed state, so newly arrived records are available without you running REPAIR or REFRESH commands.
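In a notebook it looks roughly like this (a minimal sketch; the bucket paths, schema/checkpoint locations, and target table name are placeholders you'd swap for your own):

# Incrementally ingest new Parquet files from S3 into a Delta table with Auto Loader.
(
    spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .option("cloudFiles.schemaLocation", "s3://my-bucket/_autoloader/schema")   # schema tracking/evolution
        .load("s3://my-bucket/landing/parquet/")
        .writeStream
        .option("checkpointLocation", "s3://my-bucket/_autoloader/checkpoint")      # remembers which files were processed
        .trigger(availableNow=True)   # or processingTime="1 minute" for a continuously running stream
        .toTable("main.analytics.events_bronze")
)

Run it on a schedule with availableNow for batch-style catch-up, or keep it running with a processing-time trigger for near real-time ingestion.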
If you enable “file events” on your Unity Catalog external location and set cloudFiles.useManagedFileEvents = true, Auto Loader uses a Databricks-managed event cache (SNS/SQS under the hood on AWS) for near real‑time, low‑cost discovery instead of repeated directory listings.
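If you go the file-events route, the only change on the reader side is the extra option (assuming file events are already enabled on the external location; this fragment just extends the sketch above):

# Same reader as above, with managed file events for discovery instead of repeated directory listings.
df = (
    spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .option("cloudFiles.useManagedFileEvents", "true")   # event-driven file discovery (requires UC file events)
        .option("cloudFiles.schemaLocation", "s3://my-bucket/_autoloader/schema")
        .load("s3://my-bucket/landing/parquet/")
)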
One point to note: Auto Loader does not "refresh" your existing UC external Parquet table, and MSCK or REFRESH TABLE won't help when UC hasn't discovered the new partitions/files yet. Instead, Auto Loader ingests the new files into its own Delta target. Choose it if you want continuous ingestion and immediate queryability without manual maintenance; it's the best path for growing file counts with near real-time updates.
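Once the stream has run, the ingested records are queryable straight from the Delta table Auto Loader writes to (using the placeholder table name from the sketch above):

# No MSCK REPAIR or REFRESH TABLE needed: the Delta transaction log already
# reflects every file Auto Loader has ingested.
spark.sql("SELECT * FROM main.analytics.events_bronze LIMIT 10").show()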