Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Autoloader start and end date for ingestion

kmorton
New Contributor

I have been searching for a way to set up backfilling using Auto Loader with an option to set a "start_date" or "end_date". I am working on ingesting a massive file system, but I don't want to ingest everything from the beginning. I have a start date from which I want to run the first big ingestion, so that the most recent data lands in my database first, and then slowly backfill the older data over time. Is this functionality currently in the Auto Loader settings, and if not, any suggestions on how to approach this issue?

1 REPLY

Kaniz_Fatma
Community Manager

Hi @kmorton, Databricks Auto Loader does support backfilling to capture any files that file notifications may have missed. You enable this with the cloudFiles.backfillInterval option, which schedules regular backfills over your data. However, that option does not let you set a "start_date" or "end_date" for the backfill operation.
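To make the option concrete, here is a minimal sketch of how cloudFiles.backfillInterval might be set on a stream in file notification mode. The helper names, the default interval and format, and the `spark`, `path`, and `schema_location` arguments are placeholders I made up for illustration, so adjust them to your environment.

```python
def notification_backfill_options(file_format, schema_location, interval="1 week"):
    """Build Auto Loader options for notification mode with a periodic backfill.

    The helper name and default values are illustrative assumptions.
    """
    return {
        "cloudFiles.format": file_format,
        # Location where Auto Loader stores the inferred schema.
        "cloudFiles.schemaLocation": schema_location,
        # File notification mode instead of directory listing.
        "cloudFiles.useNotifications": "true",
        # Periodically re-list the source to pick up files that
        # notifications may have missed.
        "cloudFiles.backfillInterval": interval,
    }


def read_with_backfill(spark, path, schema_location, file_format="json"):
    """Return a streaming DataFrame; `spark` is an existing SparkSession."""
    options = notification_backfill_options(file_format, schema_location)
    return spark.readStream.format("cloudFiles").options(**options).load(path)
```

Note that the interval only controls how often the whole source is re-listed; it says nothing about which date range gets ingested.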

As I understand your requirement, you want to ingest a massive file system, but not all of it at once: the most recent data first, and the older data later. However, Auto Loader does not appear to have a direct setting for choosing a start date for that first significant ingestion.

You might have to manually manage the files you want to ingest initially and then use the backfill functionality to ingest older data slowly. This could involve moving or copying the files you want to ingest into a separate directory and pointing the Auto Loader to this directory.
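As a rough sketch of the manual step, the snippet below splits a file listing into "recent" files (to stage for the first ingestion) and "older" files (to leave for later). It assumes you already have each file's path and modification time from your own listing of the storage; `FileInfo` and `split_by_start_date` are hypothetical names, not part of Auto Loader.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple, Tuple


class FileInfo(NamedTuple):
    """Hypothetical record for one file in the source file system."""
    path: str
    modified: datetime


def split_by_start_date(
    files: Iterable[FileInfo], start_date: datetime
) -> Tuple[List[FileInfo], List[FileInfo]]:
    """Split files into (recent, older) around start_date.

    Files modified at or after start_date are "recent"; the rest are "older".
    """
    recent: List[FileInfo] = []
    older: List[FileInfo] = []
    for f in files:
        if f.modified >= start_date:
            recent.append(f)
        else:
            older.append(f)
    return recent, older
```

The `recent` list is what you would copy or move into the separate directory that Auto Loader reads first; the actual copy depends on your storage and is left out here.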

Once this data has been ingested, you could point the Auto Loader to the older data's directory and use the backfill functionality to ingest this data over time. 
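One way to pace the older data is to walk backwards from your start date in fixed-size time windows. The sketch below only computes those windows. Turning each window into `modifiedAfter` / `modifiedBefore` reader options is an assumption on my part: these are Spark's generic file source options, and you should check the Auto Loader options reference for your runtime before relying on them instead of moving files. The function names are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterator, Tuple


def backfill_windows(
    oldest: datetime, start_date: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield (window_start, window_end) pairs from start_date back to oldest."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    end = start_date
    while end > oldest:
        # Clamp the last window so it never reaches before `oldest`.
        begin = max(oldest, end - step)
        yield begin, end
        end = begin


def window_options(begin: datetime, end: datetime) -> Dict[str, str]:
    """Express one window as reader options (assumed to be honored by Auto Loader)."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    return {
        "modifiedAfter": begin.strftime(fmt),
        "modifiedBefore": end.strftime(fmt),
    }
```

Each window could then drive one backfill run, newest first, so the load on the target stays bounded. Verify how files that fall exactly on a window boundary are treated before using this on real data.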

Please note that this is just a suggested approach and may need to be adjusted based on your specific needs and environment.
