<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Leveraging AI Assistant in Data Engineering Workflows - Share Your Use Cases &amp;amp; Best Practices in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/leveraging-ai-assistant-in-data-engineering-workflows-share-your/m-p/147939#M52798</link>
    <description>Discussion on leveraging the AI Assistant in Databricks Workspace for data engineering workflows and on standardizing a shared prompt library across teams.</description>
    <pubDate>Tue, 10 Feb 2026 19:41:23 GMT</pubDate>
    <dc:creator>prasad_dhongade</dc:creator>
    <dc:date>2026-02-10T19:41:23Z</dc:date>
    <item>
      <title>Leveraging AI Assistant in Data Engineering Workflows - Share Your Use Cases &amp;amp; Best Practices</title>
      <link>https://community.databricks.com/t5/data-engineering/leveraging-ai-assistant-in-data-engineering-workflows-share-your/m-p/147939#M52798</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hi Databricks Community,&lt;/P&gt;&lt;P&gt;I've been extensively using the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;AI Assistant in Databricks Workspace&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;for my data engineering tasks and have seen significant productivity gains. I'm curious to learn how others are leveraging this capability and explore opportunities to standardize our approaches.&lt;/P&gt;&lt;H2&gt;&lt;span class="lia-unicode-emoji" title=":thinking_face:"&gt;🤔&lt;/span&gt; Questions for the Community&lt;/H2&gt;&lt;H3&gt;1.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;How are you using AI Assistant in your data engineering workflows?&lt;/STRONG&gt;&lt;/H3&gt;&lt;P&gt;I'd love to hear about:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Specific use cases where AI Assistant has been most valuable&lt;/LI&gt;&lt;LI&gt;Tasks that have become significantly faster or easier&lt;/LI&gt;&lt;LI&gt;Workflows you've automated or optimized using AI assistance&lt;/LI&gt;&lt;LI&gt;Any challenges or limitations you've encountered&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;2.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Standardizing Prompts Across Teams&lt;/STRONG&gt;&lt;/H3&gt;&lt;P&gt;My team wants to create a&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;shared prompt library&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;so everyone can benefit from well-crafted prompts. Specifically:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;How can we store standard prompts in Databricks Workspace&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;so they're easily accessible to all team members?&lt;/LI&gt;&lt;LI&gt;What's the best way to organize and version-control these prompts?&lt;/LI&gt;&lt;LI&gt;Are there existing patterns or frameworks (like COIE, RISEN, etc.) 
that work well for data engineering tasks?&lt;/LI&gt;&lt;LI&gt;How do you ensure prompt quality and consistency across your team?&lt;/LI&gt;&lt;/UL&gt;&lt;HR /&gt;&lt;H2&gt;&lt;span class="lia-unicode-emoji" title=":light_bulb:"&gt;💡&lt;/span&gt; Use Cases I'm Exploring&lt;/H2&gt;&lt;P&gt;Here are some data engineering scenarios where AI Assistant could add value.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Please add your own or share what's worked for you:&lt;/STRONG&gt;&lt;/P&gt;&lt;H3&gt;&lt;STRONG&gt;Code Development &amp;amp; Optimization&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;PySpark/SQL code generation&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Converting business logic to optimized Spark code&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Performance tuning&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Analyzing slow queries and suggesting optimizations (broadcast joins, partitioning strategies, caching)&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Code refactoring&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Modernizing legacy code, improving readability, reducing technical debt&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Error diagnosis&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Troubleshooting OOM errors, data skew, shuffle issues&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Unit test generation&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Creating test cases for data transformations&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;&lt;STRONG&gt;Pipeline Development&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Delta Live Tables (DLT) pipeline creation&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Generating DLT syntax from requirements&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Data quality checks&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Writing expectations and validation logic&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Incremental processing patterns&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Implementing CDC, SCD Type 2, merge 
logic&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Orchestration logic&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Designing workflow dependencies and error handling&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;&lt;STRONG&gt;Data Modeling &amp;amp; Architecture&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Schema design&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Suggesting optimal table structures (medallion architecture, Data Vault, Kimball)&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Data lineage documentation&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Generating documentation from code&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Migration assistance&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Converting SQL Server/Oracle patterns to Databricks best practices&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Unity Catalog setup&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Generating DDL for catalogs, schemas, tables with proper governance&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;&lt;STRONG&gt;Troubleshooting &amp;amp; Debugging&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Log analysis&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Parsing Spark UI logs to identify bottlenecks&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Cluster configuration recommendations&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Right-sizing clusters based on workload patterns&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Cost optimization&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Identifying expensive operations and suggesting alternatives&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Data quality investigation&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Root cause analysis for data anomalies&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;&lt;STRONG&gt;Documentation &amp;amp; Knowledge Sharing&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Code documentation&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Generating docstrings and inline comments&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Runbook 
creation&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Documenting operational procedures&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Onboarding materials&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Creating training content for new team members&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Architecture diagrams&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Describing data flows in markdown/mermaid format&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;&lt;STRONG&gt;Metadata &amp;amp; Configuration Management&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Metadata-driven frameworks&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Generating configuration files from templates&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Dynamic SQL generation&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Creating parameterized queries from metadata&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Table property management&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Bulk updates to table comments, tags, ownership&lt;/LI&gt;&lt;/UL&gt;&lt;HR /&gt;&lt;H2&gt;&lt;span class="lia-unicode-emoji" title=":direct_hit:"&gt;🎯&lt;/span&gt; Prompt Library Storage Ideas&lt;/H2&gt;&lt;P&gt;I'm considering these approaches for storing standardized prompts.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;What has worked for your team?&lt;/STRONG&gt;&lt;/P&gt;&lt;H3&gt;&lt;STRONG&gt;Option 1: Databricks Repos (Git-backed)&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;Store prompts as markdown files in a Git repository&lt;/LI&gt;&lt;LI&gt;Sync to&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;/Workspace/Shared/prompt-library/&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;using Databricks Repos&lt;/LI&gt;&lt;LI&gt;Version control with PR reviews for quality&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Pros&lt;/STRONG&gt;: Version history, easy updates, Git workflow&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Cons&lt;/STRONG&gt;: Requires Git familiarity&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;&lt;STRONG&gt;Option 2: Workspace 
Files/Folders&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;Create&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;/Workspace/Shared/prompt-library/&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;with organized subfolders&lt;/LI&gt;&lt;LI&gt;Store prompts as&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;.md&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;or&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;.txt&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;files&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Pros&lt;/STRONG&gt;: Simple, no external dependencies, easy copy-paste&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Cons&lt;/STRONG&gt;: No version control, manual updates&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;&lt;STRONG&gt;Option 3: Notebooks as Templates&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;Create template notebooks with prompt examples in markdown cells&lt;/LI&gt;&lt;LI&gt;Include runnable code examples&lt;/LI&gt;&lt;LI&gt;Easy to clone and customize&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Pros&lt;/STRONG&gt;: Native Databricks experience, executable examples&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Cons&lt;/STRONG&gt;: Harder to version control, not ideal for pure text prompts&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;&lt;STRONG&gt;Option 4: Confluence/Wiki Integration&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;Centralized documentation with search and categorization&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Pros&lt;/STRONG&gt;: Rich formatting, comments, access controls&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Cons&lt;/STRONG&gt;: Outside Databricks, copy-paste friction&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;&lt;STRONG&gt;Option 5: Python Package (Advanced)&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;Build internal package with programmatic prompt access&lt;/LI&gt;&lt;LI&gt;Template rendering with variable injection&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Pros&lt;/STRONG&gt;: Programmatic, consistent, validated&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Cons&lt;/STRONG&gt;: Development overhead, learning 
curve&lt;/LI&gt;&lt;/UL&gt;&lt;HR /&gt;&lt;H2&gt;🙋 What I'm Looking For&lt;/H2&gt;&lt;P&gt;&lt;STRONG&gt;From the community:&lt;/STRONG&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;STRONG&gt;Real-world use cases&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- What tasks do you use AI Assistant for daily?&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Prompt examples&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- Can you share prompts that work exceptionally well for data engineering tasks?&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Storage patterns&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- How do you organize and share prompts across your team?&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Best practices&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- What prompt engineering techniques work best for Databricks workflows?&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Limitations&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;- What doesn't work well? Where do you still prefer manual coding?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;STRONG&gt;Specific questions:&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Has anyone built a&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;prompt library&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;for their data engineering team? 
How did you structure it?&lt;/LI&gt;&lt;LI&gt;Are there&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Databricks-specific prompt patterns&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;that work better than generic ones?&lt;/LI&gt;&lt;LI&gt;How do you handle&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;context limits&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;when working with large codebases or complex schemas?&lt;/LI&gt;&lt;LI&gt;Any tips for&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;teaching prompt engineering&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;to team members new to AI assistants?&lt;/LI&gt;&lt;/UL&gt;&lt;HR /&gt;&lt;H2&gt;&lt;span class="lia-unicode-emoji" title=":rocket:"&gt;🚀&lt;/span&gt; Let's Build Together&lt;/H2&gt;&lt;P&gt;I believe standardizing our AI Assistant usage can significantly boost team productivity and code quality. If there's interest, I'm happy to:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Share my prompt templates and frameworks&lt;/LI&gt;&lt;LI&gt;Collaborate on building a community prompt library&lt;/LI&gt;&lt;LI&gt;Organize a knowledge-sharing session&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Please share your experiences, use cases, and recommendations!&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;Even if you're just getting started with AI Assistant, your perspective is valuable.&lt;/P&gt;&lt;P&gt;Looking forward to learning from this amazing community! &lt;span class="lia-unicode-emoji" title=":raising_hands:"&gt;🙌&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 10 Feb 2026 19:41:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/leveraging-ai-assistant-in-data-engineering-workflows-share-your/m-p/147939#M52798</guid>
      <dc:creator>prasad_dhongade</dc:creator>
      <dc:date>2026-02-10T19:41:23Z</dc:date>
    </item>
  </channel>
</rss>