<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Migrating From Azure to Databricks in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/120984#M46297</link>
    <description>&lt;P&gt;Thanks LRALVA. Could you please help me with the cost part? I do not have production-level costing knowledge. Thanks.&lt;/P&gt;</description>
    <pubDate>Thu, 05 Jun 2025 04:28:38 GMT</pubDate>
    <dc:creator>Pratikmsbsvm</dc:creator>
    <dc:date>2025-06-05T04:28:38Z</dc:date>
    <item>
      <title>Migrating From Azure to Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/120792#M46248</link>
      <description>&lt;P&gt;Hi Techie,&lt;/P&gt;&lt;P&gt;Could someone please help me with the pros and cons of migrating my real-time streaming solution from Azure to Databricks?&lt;/P&gt;&lt;P&gt;Which components can I replace with Databricks, and what benefits would I get from doing so?&lt;/P&gt;&lt;P&gt;Current architecture:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="HLD.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/17307i3BA580477BDE64F3/image-size/large?v=v2&amp;amp;px=999" role="button" title="HLD.png" alt="HLD.png" /&gt;&lt;/span&gt;&amp;nbsp;Many thanks&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jun 2025 07:32:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/120792#M46248</guid>
      <dc:creator>Pratikmsbsvm</dc:creator>
      <dc:date>2025-06-03T07:32:24Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating From Azure to Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/120865#M46264</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/143693"&gt;@Pratikmsbsvm&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Looking at your current Azure streaming architecture, I can help you understand the pros and cons of migrating to Databricks. Let me break this down by component and overall considerations:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Components That Can Be Replaced with Databricks&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Azure Stream Analytics → Databricks Structured Streaming&lt;/STRONG&gt;&lt;BR /&gt;- What changes: Replace ASA with Spark Structured Streaming in Databricks&lt;BR /&gt;- Benefits: More flexible transformations, custom logic, ML integration, better debugging&lt;BR /&gt;- Considerations: Requires more development effort and Spark expertise&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Azure Data Lake Gen2 → Databricks Delta Lake&lt;/STRONG&gt;&lt;BR /&gt;- What changes: Use Delta Lake format on your existing ADLS Gen2 storage&lt;BR /&gt;- Benefits: ACID transactions, time travel, schema evolution, better performance&lt;BR /&gt;- Considerations: Delta Lake works well with ADLS Gen2, so minimal migration is needed&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Azure SQL Database → Databricks SQL + Delta Lake&lt;/STRONG&gt;&lt;BR /&gt;- What changes: Move analytical workloads to Delta Lake, keep transactional data in SQL DB&lt;BR /&gt;- Benefits: Better performance for analytics, unified data platform&lt;BR /&gt;- Considerations: May still need SQL DB for transactional systems&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;Pros of Migration to Databricks&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Technical Benefits&lt;/STRONG&gt;&lt;BR /&gt;- Unified Platform: Single platform for streaming, batch, ML, and analytics&lt;BR /&gt;- Advanced Analytics: Built-in ML capabilities, easy model deployment&lt;BR /&gt;- Better Performance: Optimized Spark engine, Delta Lake optimizations&lt;BR /&gt;- Flexibility: Custom transformations, complex event processing&lt;BR /&gt;- Scalability: Auto-scaling clusters, better resource utilization&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Operational Benefits&lt;/STRONG&gt;&lt;BR /&gt;- Simplified Architecture: Fewer moving parts, unified monitoring&lt;BR /&gt;- Cost Optimization: Pay-per-use model, automatic cluster termination&lt;BR /&gt;- Developer Productivity: Notebooks, collaborative environment, version control&lt;BR /&gt;- Data Governance: Unity Catalog for centralized metadata and security&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Cons of Migration to Databricks&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Complexity &amp;amp; Skills&lt;/STRONG&gt;&lt;BR /&gt;- Learning Curve: Team needs Spark/Python/Scala expertise&lt;BR /&gt;- Development Overhead: More complex than drag-and-drop ASA&lt;BR /&gt;- Debugging: Streaming jobs can be harder to troubleshoot&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Operational Challenges&lt;/STRONG&gt;&lt;BR /&gt;- Monitoring: Need to set up comprehensive monitoring for Spark jobs&lt;BR /&gt;- Latency: May have slightly higher latency than ASA for simple transformations&lt;BR /&gt;- Maintenance: More infrastructure to manage and tune&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Cost Considerations&lt;/STRONG&gt;&lt;BR /&gt;- Compute Costs: Can be higher if not properly optimized&lt;BR /&gt;- Learning Investment: Time and training costs for team upskilling&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Migration Strategy Recommendations:&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Hybrid Approach (Recommended)&lt;/STRONG&gt;&lt;BR /&gt;1. Keep: Event Hubs, ADLS Gen2, existing applications&lt;BR /&gt;2. Replace: Stream Analytics with Databricks Structured Streaming&lt;BR /&gt;3. Enhance: Add Delta Lake format, ML capabilities&lt;BR /&gt;4. Gradual: Migrate one pipeline at a time&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Components to Retain&lt;/STRONG&gt;&lt;BR /&gt;- Azure Event Hubs: Excellent integration with Databricks&lt;BR /&gt;- ADLS Gen2: Works well with Databricks Delta Lake&lt;BR /&gt;- Power BI: Native integration with Databricks SQL&lt;BR /&gt;- Existing Applications: Can connect to Databricks via JDBC/REST APIs&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;When Migration Makes Sense&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Migrate if you need:&lt;/STRONG&gt;&lt;BR /&gt;- Complex transformations or custom business logic&lt;BR /&gt;- Real-time ML inference&lt;BR /&gt;- Advanced analytics capabilities&lt;BR /&gt;- Better cost optimization for large-scale processing&lt;BR /&gt;- Unified platform for multiple data workloads&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Stay with current setup if:&lt;/STRONG&gt;&lt;BR /&gt;- Simple aggregations and transformations are sufficient&lt;BR /&gt;- Team lacks Spark expertise and timeline is tight&lt;BR /&gt;- Current solution meets all performance requirements&lt;BR /&gt;- Minimal budget for platform changes&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jun 2025 23:57:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/120865#M46264</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2025-06-03T23:57:10Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating From Azure to Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/120984#M46297</link>
      <description>&lt;P&gt;Thanks LRALVA. Could you please help me with the cost part? I do not have production-level costing knowledge. Thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 05 Jun 2025 04:28:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/120984#M46297</guid>
      <dc:creator>Pratikmsbsvm</dc:creator>
      <dc:date>2025-06-05T04:28:38Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating From Azure to Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/121085#M46331</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/143693"&gt;@Pratikmsbsvm&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Stream Analytics has no upfront costs - you only pay for the streaming units you consume, with no commitment or cluster provisioning required.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Databricks Cost Structure&lt;/STRONG&gt;&lt;BR /&gt;Two-Layer Pricing Model&lt;BR /&gt;1. Azure VM Compute Costs (the Azure infrastructure charge)&lt;BR /&gt;2. Databricks Units (DBUs) (the Databricks platform charge)&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Typical Costs for Streaming Workloads&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Small to Medium Streaming Job:&lt;/STRONG&gt;&lt;BR /&gt;- VM Costs: $200-500/month (Standard_DS3_v2 cluster)&lt;BR /&gt;- DBU Costs: $300-800/month (depending on tier and usage)&lt;BR /&gt;- Total: $500-1,300/month&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Large Streaming Job:&lt;/STRONG&gt;&lt;BR /&gt;- VM Costs: $800-2,000/month (larger clusters)&lt;BR /&gt;- DBU Costs: $1,000-3,000/month&lt;BR /&gt;- Total: $1,800-5,000/month&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Cost Optimization Strategies for Databricks&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;1. Cluster Optimization&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;# Use spot instances (60-90% cost savings)&lt;BR /&gt;"azure_attributes": {"availability": "SPOT_WITH_FALLBACK_AZURE"}&lt;/P&gt;&lt;P&gt;# Auto-termination to avoid idle costs&lt;BR /&gt;"autotermination_minutes": 30&lt;/P&gt;&lt;P&gt;# Right-size clusters based on workload&lt;BR /&gt;"autoscale": {"min_workers": 2, "max_workers": 8}&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;2. Workload Optimization&lt;/STRONG&gt;&lt;BR /&gt;- Batch vs Streaming: Use batch processing where real-time isn't critical&lt;BR /&gt;- Resource Pooling: Share clusters across multiple workloads&lt;BR /&gt;- Delta Lake: Reduce storage costs with compression and optimization&lt;BR /&gt;&lt;STRONG&gt;3. Pricing Tier Selection&lt;/STRONG&gt;&lt;BR /&gt;- Standard: For basic streaming workloads&lt;BR /&gt;- Premium: Only if you need advanced security/governance&lt;BR /&gt;- Consider Reserved Instances: For predictable workloads&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Break-Even Analysis:&lt;/STRONG&gt;&lt;BR /&gt;When Databricks Becomes Cost-Effective:&lt;BR /&gt;You'll likely save money with Databricks if:&lt;BR /&gt;- You're running 15+ streaming units in ASA&lt;BR /&gt;- You need complex transformations (reducing development time)&lt;BR /&gt;- You're already planning ML/advanced analytics initiatives&lt;BR /&gt;- You can consolidate multiple ASA jobs into shared Databricks clusters&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Recommendation&lt;/STRONG&gt;&lt;BR /&gt;Start Small: Begin with a pilot migration of your most complex streaming job to Databricks while keeping simple aggregations in ASA.&lt;BR /&gt;This hybrid approach lets you:&lt;BR /&gt;- Compare actual costs vs projections&lt;BR /&gt;- Build team expertise gradually&lt;BR /&gt;- Minimize migration risk&lt;BR /&gt;- Optimize for the best cost/benefit ratio per workload&lt;/P&gt;</description>
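The two-layer pricing model described in the reply above (VM compute plus DBUs) can be sketched as simple arithmetic. This is a rough sketch, not a pricing tool: the function name `monthly_cost`, the 730-hour month, and every rate below are hypothetical values chosen for illustration, so substitute the figures from the current price list for your region and tier.

```python
HOURS_PER_MONTH = 730  # assumption: an always-on streaming cluster, average month


def monthly_cost(nodes, vm_rate_per_hour, dbu_per_node_hour, dbu_rate,
                 hours=HOURS_PER_MONTH):
    """Return (vm_cost, dbu_cost, total) in dollars for one month.

    vm_rate_per_hour  : VM price per node per hour (infrastructure layer)
    dbu_per_node_hour : DBUs consumed by one node in one hour
    dbu_rate          : price of one DBU (platform layer)
    """
    vm_cost = nodes * vm_rate_per_hour * hours
    dbu_cost = nodes * dbu_per_node_hour * dbu_rate * hours
    return vm_cost, dbu_cost, vm_cost + dbu_cost


# Hypothetical rates for illustration only; they are not quoted prices.
vm, dbu, total = monthly_cost(nodes=3, vm_rate_per_hour=0.20,
                              dbu_per_node_hour=0.75, dbu_rate=0.30)
print(f"VM: ${vm:,.0f}  DBU: ${dbu:,.0f}  Total: ${total:,.0f}")
```

Running the same function with the hours your cluster is actually up (for example, a batch job that runs a few hours per day) shows how much auto-termination and batch scheduling change the total.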
      <pubDate>Thu, 05 Jun 2025 17:37:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/121085#M46331</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2025-06-05T17:37:42Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating From Azure to Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/121120#M46343</link>
      <description>&lt;P&gt;I completely agree with&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/24053"&gt;@lingareddy_Alva&lt;/a&gt;&amp;nbsp;on the costing part. One small point I would like to add: we should only enable &lt;SPAN&gt;spot instances (60-90% cost savings) in development or non-critical PROD environments. This option works well and is indeed cost-effective, but it is not suitable for mission-critical workloads. I used it for one of my daily loads, and the process sometimes terminated abruptly.&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/24053"&gt;@lingareddy_Alva&lt;/a&gt;&amp;nbsp;please correct me if I am wrong here.&lt;/SPAN&gt;&lt;/P&gt;</description>
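The point above about limiting spot instances to non-critical environments could be captured in a small helper that picks the availability setting per environment. This is a minimal sketch: the function name `cluster_spec` and the rule that every "prod" cluster counts as critical are assumptions made for illustration, while the field names and values mirror the cluster settings quoted earlier in the thread.

```python
import json


def cluster_spec(environment, min_workers=2, max_workers=8):
    """Build a partial cluster definition for the given environment name."""
    # Assumption for this sketch: anything named "prod" is mission critical,
    # so it runs on on-demand VMs; everything else may use spot capacity.
    critical = environment.lower() == "prod"
    availability = "ON_DEMAND_AZURE" if critical else "SPOT_WITH_FALLBACK_AZURE"
    return {
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        "autotermination_minutes": 30,
        "azure_attributes": {"availability": availability},
    }


print(json.dumps(cluster_spec("dev"), indent=2))
print(json.dumps(cluster_spec("prod"), indent=2))
```

A non-critical production job could still opt in to spot capacity by passing a different environment label; the helper only encodes the default.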
      <pubDate>Fri, 06 Jun 2025 08:06:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/121120#M46343</guid>
      <dc:creator>vaibhavs120</dc:creator>
      <dc:date>2025-06-06T08:06:59Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating From Azure to Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/121144#M46353</link>
      <description>&lt;P&gt;I agree with you&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/155343"&gt;@vaibhavs120&lt;/a&gt;&amp;nbsp;, thanks for bringing this up.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Jun 2025 15:25:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/migrating-from-azure-to-databricks/m-p/121144#M46353</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2025-06-06T15:25:16Z</dc:date>
    </item>
  </channel>
</rss>

