Erik_L
Contributor II

Given the new information I appended, I looked into Delta caching and found that I can disable it:

.option("spark.databricks.io.cache.enabled", False)

This works as a workaround while I read these files in and save them locally in DBFS, but does it have performance repercussions? I'm only doing this to ingest files uploaded to S3 by an external process. I'm worried that disabling the cache might increase the number of reads from S3 and therefore the ingestion costs.
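
For reference, here is a minimal sketch of how the cache could be turned off only for the duration of the ingestion read and then restored, so other workloads in the same session keep whatever caching they had. It assumes the setting is applied at the session level through `spark.conf` rather than as a reader option; the helper name `disk_cache_disabled` and the paths in the usage comment are hypothetical.

```python
from contextlib import contextmanager

# Setting name taken from the post above.
CACHE_KEY = "spark.databricks.io.cache.enabled"


@contextmanager
def disk_cache_disabled(spark):
    """Disable the cache inside the `with` block, then restore the old value.

    `spark` is assumed to be an active SparkSession (for example, the one
    predefined in a Databricks notebook).
    """
    # Remember the current value; None means the key was not explicitly set.
    previous = spark.conf.get(CACHE_KEY, None)
    spark.conf.set(CACHE_KEY, "false")
    try:
        yield spark
    finally:
        if previous is None:
            # Nothing was set before, so remove our override.
            spark.conf.unset(CACHE_KEY)
        else:
            spark.conf.set(CACHE_KEY, previous)


# Hypothetical usage in a notebook (format and paths are placeholders):
# with disk_cache_disabled(spark):
#     df = spark.read.format("parquet").load("s3://my-bucket/incoming/")
#     df.write.mode("overwrite").save("dbfs:/ingest/landing/")
```

Scoping the change this way limits any extra S3 reads to the ingestion step itself, though whether it affects cost in practice would still need to be measured.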
