AWS SageMaker to Azure Databricks

thewfhengineer
New Contributor III

I'm starting a project to migrate our compliance model (Pandas-based Python code) from AWS SageMaker to the Azure ecosystem.

Source: AWS (SageMaker, Airflow)
Target: Azure (Databricks, ADLS)

I'm evaluating the high-level approach and would appreciate your guidance. The core options I'm considering are:

  1. Lift & Shift: Minimal code changes, focusing on getting it running on Databricks quickly.

  2. Refactor & Modernize: Adapting the code to leverage native Azure/Databricks capabilities (like Spark) for better long-term performance.

What are your thoughts on the best path forward?

Can you share your experience?

#sagemaker #aws #migration

mark_ott
Databricks Employee

For migrating a Python/Pandas-based compliance model from AWS SageMaker/Airflow to Azure Databricks/ADLS, the best approach depends on priorities like speed, risk, cost, and future scalability. Both "Lift & Shift" and "Refactor & Modernize" have clear trade-offs that should be evaluated in the context of your business needs and long-term goals.

Lift & Shift

  • Pros

    • Fastest migration: You can quickly get your existing code running with minimal changes, allowing for rapid testing and deployment on Azure Databricks.

    • Lower immediate risk and effort: Familiarity with the existing codebase makes troubleshooting easier.

    • Simplified rollback: If issues arise, you can revert to AWS more easily since core functionality remains unchanged.

  • Cons

    • Limited scalability: Pandas operations are in-memory, which doesn't leverage Spark's distributed computing.

    • Higher long-term operational costs: Plain Pandas code runs only on the driver node, so on a multi-node cluster the workers sit idle while you still pay for them.

    • Missed Azure-native optimizations: You won’t benefit from performance improvements and integrations available with Spark, ADLS, or MLflow.

Refactor & Modernize

  • Pros

    • Performance boost: Migrating Pandas code to PySpark or the pandas API on Spark (pyspark.pandas, the successor to the now-deprecated Koalas) unlocks better scalability for large datasets through distributed processing.

    • Enhanced maintainability: Aligning with Databricks best practices and Azure-native features supports future-proofing and easier integration with ADLS, Delta Lake, and MLflow.

    • Cost-effectiveness at scale: More efficient resource utilization on Databricks and easier management of data pipelines.

  • Cons

    • Increased migration time and complexity: Significant code rewrites, team upskilling, and validation effort required.

    • Higher upfront investment: More design, testing, and documentation compared to Lift & Shift, delaying deployment and ROI.

Recommendation

If your goal is rapid migration and validation, Lift & Shift is reasonable for initial deployment, especially if you’re constrained by time or resources. However, for sustainable operations and scalability, Refactor & Modernize offers better long-term value by fully leveraging Azure Databricks and ADLS capabilities. Many organizations blend both strategies: start with Lift & Shift for quick wins, then iteratively refactor critical components to Spark/Delta Lake as business needs grow.
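One way to keep the blended path open is to write each transformation as a function that takes a DataFrame and sticks to operations that exist in both Pandas and the pandas API on Spark (`pyspark.pandas`). The sketch below is only an illustration: the column names, threshold, and function name are hypothetical, and you should confirm that each operation is supported on your Databricks Runtime version.

```python
import pandas as pd


def total_large_transactions(df, threshold):
    """Sum transaction amounts above threshold, per account.

    Uses only boolean filtering, groupby/sum, reset_index and sort_values,
    which are intended to behave the same way in pandas and pyspark.pandas.
    """
    large = df[df["amount"] > threshold]
    totals = large.groupby("account_id")["amount"].sum()
    return totals.reset_index().sort_values("account_id").reset_index(drop=True)


# Hypothetical sample data; runs with plain Pandas (the Lift & Shift stage).
transactions = pd.DataFrame({
    "account_id": ["A", "A", "B", "C", "C"],
    "amount": [50.0, 12000.0, 300.0, 15000.0, 20000.0],
})
result = total_large_transactions(transactions, threshold=10_000)
print(result)

# Later, on a Spark cluster, the same function could be given a
# pandas-on-Spark frame instead (assumption, not executed here):
#   import pyspark.pandas as ps
#   result = total_large_transactions(ps.from_pandas(transactions), 10_000)
```

Functions written this way can be moved to Spark one at a time, starting with the ones that hit memory limits first.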

Key Advice

  • Assess data size, workflow complexity, and compliance requirements before deciding.

  • Identify high-value pipelines or bottlenecks for targeted modernization.

  • Plan for change management, team training, and thorough testing.

  • Document the migration approach for auditability and future optimization.
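For the testing point, a parity check between the AWS output and the Databricks output is usually the first thing to automate. A minimal sketch, assuming both results can be loaded as Pandas DataFrames; the function name, key column, and tolerance are hypothetical and should follow your own compliance requirements:

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def outputs_match(legacy, migrated, key_cols, rtol=1e-9):
    """Return True if both frames hold the same rows, ignoring row and column order."""
    def normalise(df):
        # Sort columns and rows so ordering differences between
        # platforms do not show up as false mismatches.
        return df[sorted(df.columns)].sort_values(key_cols).reset_index(drop=True)

    try:
        assert_frame_equal(
            normalise(legacy), normalise(migrated), check_exact=False, rtol=rtol
        )
        return True
    except AssertionError:
        return False


# Hypothetical outputs from the old and new pipelines.
legacy = pd.DataFrame({"account_id": ["A", "C"], "amount": [12000.0, 35000.0]})
same_rows_reordered = pd.DataFrame({"account_id": ["C", "A"], "amount": [35000.0, 12000.0]})
drifted = pd.DataFrame({"account_id": ["A", "C"], "amount": [12000.0, 35000.5]})

print(outputs_match(legacy, same_rows_reordered, key_cols=["account_id"]))
print(outputs_match(legacy, drifted, key_cols=["account_id"]))
```

Keeping the results of checks like this alongside the migration notes also gives you evidence for the auditability point.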

By balancing immediate needs with long-term strategy, you can minimize disruption while maximizing your investment in Azure’s ecosystem.
