jAAmes_bentley
Databricks Partner

I think it depends on what your overall use case is.

  • If you're looking to extract text from images / documents specifically using Databricks, then you could consider ai_parse_document, which provides structured extraction of text and OCR content from files: ai_parse_document function | Databricks Documentation
  • If you're looking to query an LLM in bulk / batch, you should consider calling Claude with ai_query, which supports structured outputs to a certain degree using the responseFormat argument: Databricks Documentation
  • If you're looking to ping an LLM endpoint in a more one-at-a-time way, then you'll need to call the Databricks model serving endpoints directly from your own code, for example through LangChain or the openai library.

In the end, I wouldn't say LangChain is an unnecessarily heavy framework, and it comes with a lot of tools, docs, and examples which can help you upskill quickly. If you really want to keep things as minimal as possible, then use the openai library. However, as I said, I'd personally recommend the LangChain links I've given above.