DAIS 2026 · Speaker Spotlight
A conversation with Yevgeniy Ilyin
On effective document management and retrieval for generative AI — and how a 20-year-old field is being rewritten on Databricks.
The Session
Time
10:20 AM to 11:00 AM
(40 minutes)
Location
San Francisco + Virtual
The DAIS 2026 Speaker Spotlight is a series where we hand the mic to the speakers heading to Data + AI Summit and let them answer five short questions — in their own voice, no press-release polish.
Below, Yevgeniy Ilyin on what's changed for document processing and retrieval in the GenAI era, and the full-stack tooling that's catching up. Lightly edited for length — otherwise, the words are his.
“Document management and retrieval is a mature topic going back 15–20 years. Generative AI and LLMs have brought significant disruption to this area.”
— Yevgeniy Ilyin
The topic
What is your talk about, and who is it for?
My talk is an advanced session about effective document management and retrieval for generative AI, aimed at practitioners and data scientists discovering Databricks.
Why this, why now
What's changed in the last 6–12 months that makes this topic urgent right now?
Document management and retrieval is a mature topic going back 15–20 years. However, generative AI and LLMs have brought significant disruption to this area. Many existing patterns are changing, and the field needs new architectures, approaches, and tools. Databricks is at the forefront of this development and has recently built a full stack of document processing and retrieval functionality. This talk gives a practical overview of the state of the art on Databricks.
The personal stake
Why are you the person giving this talk?
I have extensive exposure to customer use cases and hands-on experience with this topic. The talk was first delivered at a customer on-site in 2025. This year's talk at DAIS is an excellent opportunity to build an up-to-date, real-world demo and provide a fresh perspective on this topic.
What you'll leave with
What will someone be able to do on Monday morning that they couldn't do before?
You'll be able to understand where the field is heading and how to build a modern document processing and retrieval solution on Databricks, using the latest technology and approaches backed by scientific research. Knowing which Databricks functionality to use can significantly reduce time-to-solution across current and future projects.
The bigger picture
How does this fit into where Databricks — and data and AI more broadly — is heading?
The talk takes a data-centred, agentic AI view of document management and retrieval. It underlines the importance of end-to-end data management and governance even before starting any downstream development, reflecting the “shift-left” trend. The talk also shows the importance of managed solutions for reducing time-to-market and TCO.
A note from us
Speakers are the heart of DAIS, and helping the world hear your story is one of the best parts of our job.
Part of the DAIS 2026 Speaker Spotlight series — more voices dropping in the weeks ahead. Got a DAIS speaker you'd love to hear from next? Mention them in the comments — we're always listening.