<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Advanced RAG Retrieval (Reranking, Hierarchical, etc) in Databricks in Generative AI</title>
    <link>https://community.databricks.com/t5/generative-ai/advanced-rag-retrieval-reranking-hierarchical-etc-in-databricks/m-p/104462#M690</link>
    <description>&lt;H3&gt;Issue with current documentation:&lt;/H3&gt;&lt;P&gt;I wish to perform advanced RAG using LangChain in Databricks. The documentation explains how to use the vector endpoint URL and an index stored in catalogs, but I could not find any advanced RAG algorithms that are easily implemented in Databricks. Can you please point me to step-wise documentation on how to proceed with this task?&lt;/P&gt;&lt;P&gt;I would appreciate it if we could implement advanced RAG with minimal reliance on catalogs and endpoints, relying instead on LangChain-native tools that make things easier.&lt;/P&gt;&lt;H3&gt;Idea or request for content:&lt;/H3&gt;&lt;P&gt;Separate sections, each covering an advanced RAG technique and how to use it in Databricks with minimal reliance on catalogs and endpoints, relying instead on LangChain-native tools.&lt;/P&gt;</description>
    <pubDate>Tue, 07 Jan 2025 07:10:35 GMT</pubDate>
    <dc:creator>meetiasha</dc:creator>
    <dc:date>2025-01-07T07:10:35Z</dc:date>
    <item>
      <title>Advanced RAG Retrieval (Reranking, Hierarchical, etc) in Databricks</title>
      <link>https://community.databricks.com/t5/generative-ai/advanced-rag-retrieval-reranking-hierarchical-etc-in-databricks/m-p/104462#M690</link>
      <description>&lt;H3&gt;Issue with current documentation:&lt;/H3&gt;&lt;P&gt;I wish to perform advanced RAG using LangChain in Databricks. The documentation explains how to use the vector endpoint URL and an index stored in catalogs, but I could not find any advanced RAG algorithms that are easily implemented in Databricks. Can you please point me to step-wise documentation on how to proceed with this task?&lt;/P&gt;&lt;P&gt;I would appreciate it if we could implement advanced RAG with minimal reliance on catalogs and endpoints, relying instead on LangChain-native tools that make things easier.&lt;/P&gt;&lt;H3&gt;Idea or request for content:&lt;/H3&gt;&lt;P&gt;Separate sections, each covering an advanced RAG technique and how to use it in Databricks with minimal reliance on catalogs and endpoints, relying instead on LangChain-native tools.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2025 07:10:35 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/advanced-rag-retrieval-reranking-hierarchical-etc-in-databricks/m-p/104462#M690</guid>
      <dc:creator>meetiasha</dc:creator>
      <dc:date>2025-01-07T07:10:35Z</dc:date>
    </item>
    <item>
      <title>Re: Advanced RAG Retrieval (Reranking, Hierarchical, etc) in Databricks</title>
      <link>https://community.databricks.com/t5/generative-ai/advanced-rag-retrieval-reranking-hierarchical-etc-in-databricks/m-p/138232#M1363</link>
      <description>&lt;P&gt;Greetings&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/141051"&gt;@meetiasha&lt;/a&gt;&amp;nbsp;, yes—there’s a gap between Databricks’ basic “vector endpoint + catalog index” examples and truly advanced RAG, so below is a step‑wise, LangChain‑first playbook you can run entirely on Databricks notebooks with local vector stores (FAISS/Chroma), secrets, and LCEL—no Unity Catalog tables or Vector Search endpoints required.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Minimal setup on Databricks&lt;/STRONG&gt;&lt;BR /&gt;- Install packages in a notebook cell: %pip install langchain langchain-openai langchain-text-splitters langchain-chroma chromadb faiss-cpu, which uses notebook‑scoped libraries that don’t affect the whole cluster.&lt;BR /&gt;- Persist vector stores locally on DBFS (for example, persist_directory="/dbfs/FileStore/rag/chroma" or FAISS index files under /dbfs) and manage paths with dbutils.fs.&lt;BR /&gt;- Store API keys (OpenAI, Cohere, etc.) in Databricks Secrets and load them at runtime via dbutils.secrets.get to avoid hardcoding credentials.&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;# Databricks notebook cell&lt;BR /&gt;# %pip install langchain langchain-openai langchain-text-splitters langchain-chroma chromadb faiss-cpu&lt;/P&gt;
&lt;P&gt;from langchain_openai import OpenAIEmbeddings, ChatOpenAI&lt;BR /&gt;from langchain_text_splitters import RecursiveCharacterTextSplitter&lt;BR /&gt;from langchain_chroma import Chroma&lt;/P&gt;
&lt;P&gt;openai_key = dbutils.secrets.get("my-scope", "OPENAI_API_KEY")&lt;BR /&gt;emb = OpenAIEmbeddings(api_key=openai_key)&lt;BR /&gt;splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)&lt;/P&gt;
&lt;P&gt;# Persist a Chroma store under DBFS (no catalogs/endpoints)&lt;BR /&gt;vs = Chroma(collection_name="docs", embedding_function=emb,&lt;BR /&gt;            persist_directory="/dbfs/FileStore/rag/chroma")&lt;BR /&gt;```&lt;/P&gt;
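&lt;P&gt;The snippets below assume a docs list of loaded LangChain documents. A minimal sketch to build and index one, assuming plain-text files under an example DBFS folder:&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;import glob&lt;BR /&gt;from langchain_community.document_loaders import TextLoader&lt;/P&gt;
&lt;P&gt;# Load source files into `docs` (the folder and file type are examples)&lt;BR /&gt;docs = []&lt;BR /&gt;for path in glob.glob("/dbfs/FileStore/rag/raw/*.txt"):&lt;BR /&gt;    docs.extend(TextLoader(path).load())&lt;BR /&gt;vs.add_documents(splitter.split_documents(docs))  # index chunks into the Chroma store&lt;BR /&gt;```&lt;/P&gt;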
&lt;P&gt;&lt;STRONG&gt;Multi-Query Retrieval (query expansion)&lt;/STRONG&gt;&lt;BR /&gt;- MultiQueryRetriever uses an LLM to reformulate a single question into several variants, broadening recall and reducing single‑query blind spots.&lt;BR /&gt;- This integrates directly with any vector store retriever, and the chain is composed with LCEL for clean, production‑ready orchestration.&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;from langchain.retrievers.multi_query import MultiQueryRetriever&lt;BR /&gt;from langchain_core.runnables import RunnablePassthrough&lt;BR /&gt;from langchain_core.output_parsers import StrOutputParser&lt;BR /&gt;from langchain_core.prompts import ChatPromptTemplate&lt;/P&gt;
&lt;P&gt;llm = ChatOpenAI(api_key=openai_key, model="gpt-4o-mini", temperature=0)&lt;BR /&gt;retriever = vs.as_retriever(search_kwargs={"k": 6})&lt;/P&gt;
&lt;P&gt;mqr = MultiQueryRetriever.from_llm(retriever=retriever, llm=llm)&lt;/P&gt;
&lt;P&gt;prompt = ChatPromptTemplate.from_messages([&lt;BR /&gt;    ("system", "Answer using only the provided context."),&lt;BR /&gt;    ("human", "Context:\n{context}\n\nQuestion: {question}"),&lt;BR /&gt;])&lt;/P&gt;
&lt;P&gt;def format_docs(docs): return "\n\n".join(d.page_content for d in docs)&lt;/P&gt;
&lt;P&gt;rag = ({"context": mqr | format_docs, "question": RunnablePassthrough()}&lt;BR /&gt;| prompt | llm | StrOutputParser())&lt;/P&gt;
&lt;P&gt;print(rag.invoke("What changed in the latest policy?"))&lt;BR /&gt;```&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Parent–Child (ParentDocumentRetriever)&lt;/STRONG&gt;&lt;BR /&gt;- ParentDocumentRetriever stores small chunks for retrieval but returns their larger parent document, preserving context and reducing fragmented, out-of-context answers.&lt;BR /&gt;- Pair it with Chroma/FAISS for storage and an in‑memory or simple key‑value store for parent documents; it remains local and catalog‑free.&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;from langchain.retrievers import ParentDocumentRetriever&lt;BR /&gt;from langchain.storage import InMemoryStore&lt;/P&gt;
&lt;P&gt;parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)&lt;BR /&gt;child_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)&lt;BR /&gt;store = InMemoryStore()&lt;/P&gt;
&lt;P&gt;# Tip: consider a dedicated Chroma collection if vs already holds plain chunks&lt;BR /&gt;parent_ret = ParentDocumentRetriever(&lt;BR /&gt;    vectorstore=vs,&lt;BR /&gt;    docstore=store,&lt;BR /&gt;    child_splitter=child_splitter,&lt;BR /&gt;    parent_splitter=parent_splitter,&lt;BR /&gt;)&lt;BR /&gt;# parent_ret.add_documents(docs)  # run once to index the corpus&lt;BR /&gt;```&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;Contextual Compression with reranking&lt;/STRONG&gt;&lt;BR /&gt;- ContextualCompressionRetriever runs a reranker or compressor over initially retrieved documents to keep only the most answer‑bearing snippets.&lt;BR /&gt;- You can use an LLM‑based or third‑party reranker (for example, Cohere or Contextual AI) to substantially improve precision at low k.&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;# %pip install langchain-cohere  # the reranker now lives in its own package&lt;BR /&gt;from langchain.retrievers.contextual_compression import ContextualCompressionRetriever&lt;BR /&gt;from langchain_cohere import CohereRerank&lt;/P&gt;
&lt;P&gt;cohere_key = dbutils.secrets.get("my-scope", "COHERE_API_KEY")&lt;BR /&gt;compressor = CohereRerank(cohere_api_key=cohere_key, model="rerank-english-v3.0", top_n=6)&lt;BR /&gt;base_ret = vs.as_retriever(search_kwargs={"k": 20})&lt;BR /&gt;cc_ret = ContextualCompressionRetriever(base_retriever=base_ret, base_compressor=compressor)&lt;BR /&gt;```&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;HyDE (Hypothetical Document Embeddings)&lt;/STRONG&gt;&lt;BR /&gt;- HyDE uses an LLM to synthesize a hypothetical document for the user’s query, embeds that synthetic text, and searches with that embedding to boost recall in sparse or noisy corpora.&lt;BR /&gt;- In LangChain, wrap an LLM and embeddings with HypotheticalDocumentEmbedder and use it in place of a standard embedding function to build or query a local vector store.&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;from langchain.chains import HypotheticalDocumentEmbedder&lt;BR /&gt;hyde_emb = HypotheticalDocumentEmbedder.from_llm(llm, emb, "web_search")&lt;BR /&gt;# Use hyde_emb with your vector store (e.g., for query embeddings or indexing variants)&lt;BR /&gt;```&lt;/P&gt;
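&lt;P&gt;A query-time usage sketch, assuming the Chroma store vs from the setup step:&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;# Embed a hypothetical answer, then search the store with that vector&lt;BR /&gt;q_vec = hyde_emb.embed_query("What changed in the latest policy?")&lt;BR /&gt;hyde_hits = vs.similarity_search_by_vector(q_vec, k=6)&lt;BR /&gt;```&lt;/P&gt;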
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;Self‑Query Retriever (metadata‑aware filtering)&lt;/STRONG&gt;&lt;BR /&gt;- SelfQueryRetriever lets an LLM translate natural‑language filters (time ranges, authors, sections) into vector‑store search parameters, improving retrieval control without brittle manual parsing.&lt;BR /&gt;- It’s ideal when documents have rich metadata and you want free‑form queries to map to filters like tags or date constraints.&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;# %pip install lark  # required by the self-query translator&lt;BR /&gt;from langchain.retrievers.self_query.base import SelfQueryRetriever&lt;BR /&gt;from langchain.chains.query_constructor.base import AttributeInfo&lt;/P&gt;
&lt;P&gt;# Illustrative metadata schema; adapt field names to your documents&lt;BR /&gt;fields = [AttributeInfo(name="doc_type", description="e.g. 'policy'", type="string"),&lt;BR /&gt;          AttributeInfo(name="year", description="publication year", type="integer")]&lt;BR /&gt;sqr = SelfQueryRetriever.from_llm(llm, vs, "internal policy documents", fields)&lt;BR /&gt;results = sqr.invoke("security changes in Q3 2024, only policy PDFs")&lt;BR /&gt;```&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;Multi‑Vector and summary vectors&lt;/STRONG&gt;&lt;BR /&gt;- MultiVectorRetriever stores multiple embeddings per document (for example, raw chunk + summary + title) to expand matches and strengthen recall on terse queries.&lt;BR /&gt;- This pairs well with compression or reranking so the final context window remains concise despite broader initial matches.&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;from langchain.retrievers.multi_vector import MultiVectorRetriever&lt;/P&gt;
&lt;P&gt;mv_store = InMemoryStore()  # InMemoryStore imported in the parent–child section&lt;BR /&gt;mvr = MultiVectorRetriever(vectorstore=vs, docstore=mv_store, id_key="doc_id")&lt;BR /&gt;```&lt;/P&gt;
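&lt;P&gt;To populate it, embed per-document summaries (or titles) under a shared doc_id and keep the originals in the docstore. A minimal sketch reusing the llm and docs defined above; the summary prompt is illustrative:&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;import uuid&lt;BR /&gt;from langchain_core.documents import Document&lt;/P&gt;
&lt;P&gt;ids = [str(uuid.uuid4()) for _ in docs]&lt;BR /&gt;summaries = [llm.invoke("Summarize in two sentences:\n" + d.page_content[:2000]).content for d in docs]&lt;BR /&gt;vs.add_documents([Document(page_content=s, metadata={"doc_id": i}) for s, i in zip(summaries, ids)])&lt;BR /&gt;mvr.docstore.mset(list(zip(ids, docs)))  # full documents are returned at query time&lt;BR /&gt;```&lt;/P&gt;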
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;Retriever ensembles and routing&lt;/STRONG&gt;&lt;BR /&gt;- An ensemble can combine lexical (BM25/TF‑IDF) and vector retrievers and weight their scores, often outperforming any single retriever on heterogeneous data.&lt;BR /&gt;- With LCEL, dynamically route to different retrievers based on the query intent, then merge results before compression and generation.&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;# %pip install rank_bm25  # dependency of BM25Retriever&lt;BR /&gt;from langchain.retrievers.ensemble import EnsembleRetriever&lt;BR /&gt;from langchain_community.retrievers import BM25Retriever&lt;/P&gt;
&lt;P&gt;bm25 = BM25Retriever.from_texts([d.page_content for d in docs])&lt;BR /&gt;vector_ret = vs.as_retriever(search_kwargs={"k": 8})&lt;BR /&gt;ensemble = EnsembleRetriever(retrievers=[bm25, vector_ret], weights=[0.4, 0.6])&lt;BR /&gt;```&lt;/P&gt;
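&lt;P&gt;The merged hits feed straight into the downstream steps; for example, reranking them with the Cohere compressor defined above:&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;# Rerank the merged lexical + vector candidates before generation&lt;BR /&gt;hybrid_ret = ContextualCompressionRetriever(base_retriever=ensemble, base_compressor=compressor)&lt;BR /&gt;```&lt;/P&gt;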
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;Graph‑augmented RAG (optional)&lt;/STRONG&gt;&lt;BR /&gt;- For relationship‑heavy domains, add a knowledge graph (for example, Neo4j) and use graph queries alongside vector search to ground answers in entities and edges.&lt;BR /&gt;- LangChain provides advanced RAG templates with Neo4j that you can adapt to your DBFS‑persisted embeddings workflow.&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;# See neo4j-advanced-rag template in LangChain; run graph retrieval then fuse with vector hits&lt;BR /&gt;```&lt;/P&gt;
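&lt;P&gt;A minimal sketch of the graph side, assuming a reachable Neo4j instance, %pip install neo4j, and placeholder connection details and Cypher:&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;from langchain_community.graphs import Neo4jGraph&lt;/P&gt;
&lt;P&gt;neo4j_pwd = dbutils.secrets.get("my-scope", "NEO4J_PASSWORD")  # hypothetical secret key&lt;BR /&gt;graph = Neo4jGraph(url="bolt://HOST:7687", username="neo4j", password=neo4j_pwd)&lt;BR /&gt;facts = graph.query("MATCH (e)-[r]-&gt;(n) RETURN e.name, type(r), n.name LIMIT 10")&lt;BR /&gt;# Fuse: prepend the graph facts to the vector-retrieved context before prompting&lt;BR /&gt;```&lt;/P&gt;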
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;Local vector stores on DBFS (FAISS/Chroma)&lt;/STRONG&gt;&lt;BR /&gt;- FAISS and Chroma both run fully local, persist to files, and avoid reliance on Databricks Vector Search or Unity Catalog, fitting the “LangChain‑exclusive” requirement.&lt;BR /&gt;- Use DBFS paths for persistence so jobs and notebooks can share indexes predictably across runs without external services.&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;# Persist a FAISS index locally (example)&lt;BR /&gt;from langchain_community.vectorstores import FAISS&lt;BR /&gt;faiss_vs = FAISS.from_texts([d.page_content for d in docs], embedding=emb)&lt;BR /&gt;faiss_vs.save_local("/dbfs/FileStore/rag/faiss_index")&lt;BR /&gt;# Reload in a later job/notebook run (flag required by recent versions):&lt;BR /&gt;# faiss_vs = FAISS.load_local("/dbfs/FileStore/rag/faiss_index", emb, allow_dangerous_deserialization=True)&lt;BR /&gt;```&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;Orchestrate with LCEL&lt;/STRONG&gt;&lt;BR /&gt;- LangChain Expression Language (LCEL) composes retrievers, prompts, LLMs, and parsers into a single, efficient graph that’s easy to test and deploy on jobs.&lt;BR /&gt;- Build a standard pattern: {"context": retriever | format_docs, "question": RunnablePassthrough()} | prompt | llm | StrOutputParser().&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;from langchain_core.runnables import RunnablePassthrough&lt;BR /&gt;from langchain_core.output_parsers import StrOutputParser&lt;BR /&gt;# the rag chain from the Multi-Query section already follows LCEL; reuse it in jobs&lt;BR /&gt;```&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;Putting it together (recommended baseline)&lt;/STRONG&gt;&lt;BR /&gt;- Start with ParentDocumentRetriever + MultiQueryRetriever to improve recall while returning coherent parent docs, then add ContextualCompressionRetriever with a reranker to tighten final context.&lt;BR /&gt;- Persist your Chroma/FAISS indexes to DBFS, load secrets with dbutils.secrets, and manage packages with %pip to keep everything self‑contained and independent of catalogs and endpoints.&lt;/P&gt;
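&lt;P&gt;As a concrete starting point, a sketch wiring those pieces together (it reuses parent_ret, compressor, prompt, llm, and format_docs from the sections above; treat it as a baseline to tune, not a definitive pipeline):&lt;/P&gt;
&lt;P&gt;```python&lt;BR /&gt;# Recall: multi-query expansion over parent-document retrieval&lt;BR /&gt;mq_parent = MultiQueryRetriever.from_llm(retriever=parent_ret, llm=llm)&lt;BR /&gt;# Precision: rerank the broadened candidate set down to a tight context&lt;BR /&gt;final_ret = ContextualCompressionRetriever(base_retriever=mq_parent, base_compressor=compressor)&lt;BR /&gt;baseline = ({"context": final_ret | format_docs, "question": RunnablePassthrough()}&lt;BR /&gt;            | prompt | llm | StrOutputParser())&lt;BR /&gt;print(baseline.invoke("What changed in the latest policy?"))&lt;BR /&gt;```&lt;/P&gt;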
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hoping this helps you.&lt;/P&gt;
&lt;P&gt;Cheers, Louis.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 08 Nov 2025 22:16:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/advanced-rag-retrieval-reranking-hierarchical-etc-in-databricks/m-p/138232#M1363</guid>
      <dc:creator>Louis_Frolio</dc:creator>
      <dc:date>2025-11-08T22:16:52Z</dc:date>
    </item>
  </channel>
</rss>

