<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: From RAG Demo to Production on Databricks: 7 Things Teams Should Validate First in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/from-rag-demo-to-production-on-databricks-7-things-teams-should/m-p/158527#M54731</link>
    <description>&lt;P&gt;Thanks for reading. I’m especially interested in hearing from people who have worked on real RAG or GenAI workflows.&lt;/P&gt;&lt;P&gt;Which one has been the biggest challenge for your team?&lt;/P&gt;&lt;P&gt;1. Choosing the right source data&lt;BR /&gt;2. Access control and governance&lt;BR /&gt;3. Improving retrieval quality&lt;BR /&gt;4. Evaluating groundedness&lt;BR /&gt;5. Monitoring cost and latency&lt;BR /&gt;6. Getting business users to trust the answers&lt;/P&gt;&lt;P&gt;For me, retrieval quality and evaluation are usually where demo systems start to become real production systems.&lt;/P&gt;</description>
    <pubDate>Mon, 08 Jun 2026 01:03:23 GMT</pubDate>
    <dc:creator>naveen0808</dc:creator>
    <dc:date>2026-06-08T01:03:23Z</dc:date>
    <item>
      <title>From RAG Demo to Production on Databricks: 7 Things Teams Should Validate First</title>
      <link>https://community.databricks.com/t5/data-engineering/from-rag-demo-to-production-on-databricks-7-things-teams-should/m-p/158526#M54730</link>
      <description>&lt;H3&gt;From RAG Demo to Production on Databricks: 7 Things Teams Should Validate&amp;nbsp;First&lt;/H3&gt;&lt;P class=""&gt;By Naveen Ayalla&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;Many teams can build a RAG demo quickly.&lt;/STRONG&gt;&lt;/P&gt;&lt;P class=""&gt;Upload documents, create embeddings, connect a model, ask a question, and show an answer.&lt;/P&gt;&lt;P class=""&gt;But production is different.&lt;/P&gt;&lt;P class=""&gt;In production, the question is not only: “Can the model answer?”&lt;/P&gt;&lt;P class=""&gt;The real question is:&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;Can the system answer accurately, securely, consistently, and with enough trust for real business users?&lt;/STRONG&gt;&lt;/P&gt;&lt;P class=""&gt;I have seen many GenAI ideas slow down after the demo stage because the team did not validate governance, retrieval quality, evaluation, monitoring, or ownership early enough.&lt;/P&gt;&lt;P class=""&gt;Here is a simple checklist I use when thinking about RAG workflows on Databricks.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="naveen0808_0-1780880239856.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/27625iEDB10A05386D13EC/image-size/large?v=v2&amp;amp;px=999" role="button" title="naveen0808_0-1780880239856.png" alt="naveen0808_0-1780880239856.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;From Online&amp;nbsp;&lt;/P&gt;&lt;H3&gt;My 7-point production checklist&lt;/H3&gt;&lt;H3&gt;1. 
Start with a focused use&amp;nbsp;case&lt;/H3&gt;&lt;P class=""&gt;A RAG system should not begin with “let’s index everything.”&lt;/P&gt;&lt;P class=""&gt;It should begin with a specific business problem.&lt;/P&gt;&lt;P class=""&gt;For example:&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;Help support teams answer product questions faster.&lt;/LI&gt;&lt;LI&gt;Help analysts search internal data documentation.&lt;/LI&gt;&lt;LI&gt;Help engineers troubleshoot pipeline failures.&lt;/LI&gt;&lt;LI&gt;Help business users understand policies or procedures.&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;A focused use case makes it easier to choose the right data, evaluate quality, and measure success.&lt;/P&gt;&lt;H3&gt;2. Use trusted data, not just available data&lt;/H3&gt;&lt;P class=""&gt;Just because data exists does not mean it should be used in a GenAI workflow.&lt;/P&gt;&lt;P class=""&gt;Before indexing content, I like to ask:&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;Who owns this data?&lt;/LI&gt;&lt;LI&gt;Is it current?&lt;/LI&gt;&lt;LI&gt;Is it approved for this use case?&lt;/LI&gt;&lt;LI&gt;Does it contain sensitive information?&lt;/LI&gt;&lt;LI&gt;Who should be allowed to access it?&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;Bad source data creates bad AI answers. Clean and trusted data is the foundation.&lt;/P&gt;&lt;H3&gt;3. Add metadata before retrieval&lt;/H3&gt;&lt;P class=""&gt;Metadata is often ignored in early RAG demos, but it becomes very important in production.&lt;/P&gt;&lt;P class=""&gt;Useful metadata may include:&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;Document owner&lt;/LI&gt;&lt;LI&gt;Source system&lt;/LI&gt;&lt;LI&gt;Updated date&lt;/LI&gt;&lt;LI&gt;Department&lt;/LI&gt;&lt;LI&gt;Product name&lt;/LI&gt;&lt;LI&gt;Region&lt;/LI&gt;&lt;LI&gt;Sensitivity level&lt;/LI&gt;&lt;LI&gt;Access group&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;This helps with filtering, troubleshooting, access control, and improving retrieval quality.&lt;/P&gt;&lt;H3&gt;4. 
Treat governance as part of the architecture&lt;/H3&gt;&lt;P class=""&gt;For enterprise RAG, governance should not be added at the end.&lt;/P&gt;&lt;P class=""&gt;If a user is not allowed to access a document directly, they should not be able to access it through an AI assistant either.&lt;/P&gt;&lt;P class=""&gt;This is why governance, permissions, lineage, and auditability are important parts of the design. The AI system should not become a shortcut around data governance.&lt;/P&gt;&lt;H3&gt;5. Evaluate retrieval separately from the final&amp;nbsp;answer&lt;/H3&gt;&lt;P class=""&gt;When a RAG answer is wrong, the model is not always the only problem.&lt;/P&gt;&lt;P class=""&gt;Sometimes the system retrieved the wrong document.&lt;BR /&gt;Sometimes the right document was missing.&lt;BR /&gt;Sometimes the chunk was incomplete.&lt;BR /&gt;Sometimes the source was outdated.&lt;BR /&gt;Sometimes the model ignored the context.&lt;/P&gt;&lt;P class=""&gt;That is why I prefer to evaluate two things separately:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TH&gt;What to evaluate&lt;/TH&gt;&lt;TH&gt;Question to ask&lt;/TH&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Retrieval quality&lt;/TD&gt;&lt;TD&gt;Did we retrieve the right context?&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Answer quality&lt;/TD&gt;&lt;TD&gt;Did the model use that context correctly?&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P class=""&gt;This makes troubleshooting much easier.&lt;/P&gt;&lt;H3&gt;6. Tell the model when not to&amp;nbsp;answer&lt;/H3&gt;&lt;P class=""&gt;One of the most useful instructions in enterprise RAG is simple:&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;If the retrieved context is not enough, say that the information is not available instead of guessing.&lt;/STRONG&gt;&lt;/P&gt;&lt;P class=""&gt;This sounds basic, but it matters.&lt;/P&gt;&lt;P class=""&gt;For business users, a confident wrong answer is worse than a clear limitation.&lt;/P&gt;&lt;H3&gt;7. 
Monitor after&amp;nbsp;launch&lt;/H3&gt;&lt;P class=""&gt;A RAG system is not finished after deployment.&lt;/P&gt;&lt;P class=""&gt;Documents change.&lt;BR /&gt;Users ask new questions.&lt;BR /&gt;Models change.&lt;BR /&gt;Costs change.&lt;BR /&gt;Business rules change.&lt;/P&gt;&lt;P class=""&gt;After launch, teams should monitor:&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;User feedback&lt;/LI&gt;&lt;LI&gt;Failed questions&lt;/LI&gt;&lt;LI&gt;Retrieval quality&lt;/LI&gt;&lt;LI&gt;Latency&lt;/LI&gt;&lt;LI&gt;Cost&lt;/LI&gt;&lt;LI&gt;Error rate&lt;/LI&gt;&lt;LI&gt;Outdated sources&lt;/LI&gt;&lt;LI&gt;Low-confidence answers&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;The best RAG systems improve continuously.&lt;/P&gt;&lt;H3&gt;Final thought&lt;/H3&gt;&lt;P class=""&gt;To me, production RAG is not just an LLM connected to a vector index.&lt;/P&gt;&lt;P class=""&gt;It is a governed data product.&lt;/P&gt;&lt;P class=""&gt;It needs trusted data, metadata, permissions, evaluation, monitoring, and clear ownership.&lt;/P&gt;&lt;P class=""&gt;Databricks can be a strong foundation for this type of workflow because data engineering, governance, machine learning, and AI workflows can be connected through the lakehouse approach.&lt;/P&gt;&lt;P class=""&gt;I am curious how others are handling this in real projects:&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;What is the hardest part of taking RAG from demo to production — governance, retrieval quality, evaluation, monitoring, cost, or user adoption?&lt;/STRONG&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;#Generative AI #data Engineering&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2026 01:00:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/from-rag-demo-to-production-on-databricks-7-things-teams-should/m-p/158526#M54730</guid>
      <dc:creator>naveen0808</dc:creator>
      <dc:date>2026-06-08T01:00:36Z</dc:date>
    </item>
    <item>
      <title>Re: From RAG Demo to Production on Databricks: 7 Things Teams Should Validate First</title>
      <link>https://community.databricks.com/t5/data-engineering/from-rag-demo-to-production-on-databricks-7-things-teams-should/m-p/158527#M54731</link>
      <description>&lt;P&gt;Thanks for reading. I’m especially interested in hearing from people who have worked on real RAG or GenAI workflows.&lt;/P&gt;&lt;P&gt;Which one has been the biggest challenge for your team?&lt;/P&gt;&lt;P&gt;1. Choosing the right source data&lt;BR /&gt;2. Access control and governance&lt;BR /&gt;3. Improving retrieval quality&lt;BR /&gt;4. Evaluating groundedness&lt;BR /&gt;5. Monitoring cost and latency&lt;BR /&gt;6. Getting business users to trust the answers&lt;/P&gt;&lt;P&gt;For me, retrieval quality and evaluation are usually where demo systems start to become real production systems.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2026 01:03:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/from-rag-demo-to-production-on-databricks-7-things-teams-should/m-p/158527#M54731</guid>
      <dc:creator>naveen0808</dc:creator>
      <dc:date>2026-06-08T01:03:23Z</dc:date>
    </item>
  </channel>
</rss>

