I'd love to build a quick "art of the possible" demo showing how easy it is to query unstructured PDFs using natural language. In Snowflake, I wired up a similar solution in ~2 hours just by following their tutorial guide.
Does anyone know the best way to replicate this in Databricks? Even better, does Databricks have a similar step-by-step resource for NLP on PDFs? I did try this notebook, but it works on structured data (Databricks documents that are already chunked and embedded). I also tried the Genie route, and that also expects structured tables. No bueno there!
Basically, I have a bunch of PDF files that I would like to ask natural language questions against, including asking to compare a specific KPI present in one PDF document vs. another. Hoping it is not too difficult to do this in Databricks!
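To make the ask concrete, here is a rough sketch of the flow I have in mind, in plain Python with nothing Databricks-specific. It assumes the PDF text has already been extracted, and it uses a toy word-overlap score as a stand-in for real embeddings. The function names, file names, and sample text are all made up for illustration.

```python
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split already-extracted PDF text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(question, chunk):
    """Toy relevance: number of shared tokens (a stand-in for embedding similarity)."""
    q = Counter(tokenize(question))
    c = Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def top_chunks(question, docs, k=2):
    """docs maps filename -> extracted text. Returns the k best (score, filename, chunk)."""
    scored = [
        (score(question, chunk), name, chunk)
        for name, text in docs.items()
        for chunk in chunk_text(text)
    ]
    # Sort on the score only, so ties keep their original document order.
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:k]


# Hypothetical text standing in for two extracted PDFs.
docs = {
    "report_a.pdf": "Revenue grew during the year. Churn rate was stable across regions.",
    "report_b.pdf": "Headcount increased. Revenue declined slightly in the final quarter.",
}
hits = top_chunks("How did revenue change?", docs)
for s, name, chunk in hits:
    print(s, name, chunk)
```

In the real demo, the retrieved chunks from both files would be handed to an LLM along with the question, which is how I would expect the "compare this KPI across two PDFs" case to work.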
Any guidance would be greatly appreciated!