-werners-
Esteemed Contributor III

It depends on what you mean by 'process'.
Spark can read several files at once. All you need is the path to a directory that contains the files.
You can then read the whole directory with spark.read.parquet, spark.read.csv, spark.read.json, and so on, depending on your file format.
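
As a rough sketch of what that could look like in PySpark: the helper name `read_directory` and the example path below are made up for illustration, and `spark` is assumed to be the SparkSession that a Databricks notebook already provides. It uses the generic `spark.read.format(...).load(...)` form, which does the same job as the format-specific `spark.read.parquet/csv/json` calls.

```python
def read_directory(spark, directory, file_format="parquet", **options):
    """Read every file in `directory` into a single DataFrame.

    `spark` is an existing SparkSession (predefined in Databricks notebooks).
    `file_format` is e.g. "parquet", "csv" or "json".
    `options` are passed through to the reader, e.g. header="true" for CSV.
    """
    # Pointing the reader at a directory makes Spark pick up the files inside it.
    return spark.read.format(file_format).options(**options).load(directory)


# Hypothetical usage; the path is a placeholder, not a real location:
# df = read_directory(spark, "/mnt/datalake/input/", "csv", header="true")
# df.printSchema()
```

Calling `df.printSchema()` afterwards is a quick way to check that the columns came out the way you expect.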

It is important, however, that all files have the same schema (columns); otherwise this approach will not work.

Is this what you are looking for? Or do you also need help with linking your data lake to Databricks?
