Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.


pvignesh92
Honored Contributor

Databricks Auto Loader is an interesting feature that can be used to load data incrementally.

✳ It can process new data files as they arrive in cloud object storage

✳ It can ingest JSON, CSV, Parquet, Avro, ORC, text, and even binary files

✳ Auto Loader can scale to millions of files per hour. It keeps its state in a key-value store called RocksDB at the checkpoint location. Because the state lives in the checkpoint, a stream can resume from where it left off after a failure and can guarantee exactly-once semantics.
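
For anyone who wants to try the points above, here is a minimal sketch, not a definitive setup. It assumes a Databricks notebook where `spark` is already defined, and that the schema is inferred. The helper names (`auto_loader_options`, `start_ingest`) and every path and table name are hypothetical placeholders of mine. The format list only mirrors the formats named in this post.

```python
# Formats named in the post above; newer runtimes may accept more.
SUPPORTED_FORMATS = {"json", "csv", "parquet", "avro", "orc", "text", "binaryFile"}


def auto_loader_options(file_format, schema_location):
    """Build reader options for the `cloudFiles` (Auto Loader) source."""
    if file_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {file_format!r}")
    return {
        "cloudFiles.format": file_format,
        # Used when the schema is inferred; Auto Loader tracks the schema here.
        "cloudFiles.schemaLocation": schema_location,
    }


def start_ingest(spark, source_path, target_table, checkpoint_path, file_format="json"):
    """Start an incremental ingest of new files into a table.

    `spark` is the session a Databricks notebook provides. All paths and the
    table name are placeholders chosen by the caller.
    """
    stream = (
        spark.readStream.format("cloudFiles")
        .options(**auto_loader_options(file_format, checkpoint_path))
        .load(source_path)
    )
    return (
        stream.writeStream
        # Stream state lives here, so a restart resumes where it left off.
        .option("checkpointLocation", checkpoint_path)
        # Process everything available now, then stop (Spark 3.3+).
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```

In a notebook the call would look something like `start_ingest(spark, "/path/to/landing", "bronze_events", "/path/to/_checkpoints/events")`. Reusing the same checkpoint path on the next run is what lets the stream pick up only the files that arrived since the last run.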

I have written up Databricks Auto Loader on Medium; the links are below. Happy to hear any feedback 🙂

🔅 Databricks Autoloader Series - Accelerating Incremental Data Ingestion: https://lnkd.in/ew3vaPmp

🔅 Databricks Auto Loader Series— The basics: https://lnkd.in/e2zanWfc

1 REPLY

Ajay-Pandey
Esteemed Contributor III

Thanks for sharing

Ajay Kumar Pandey
