
Databricks setup/deployment checklist/best practices

Phani1
Valued Contributor II

Hi Team, could you please share or guide us on any checklist/best practices for Databricks setup/deployment?

2 REPLIES

Phani1
Valued Contributor II

The Databricks platform is on Azure.

icyflame92
New Contributor II

Hi @Phani1, the best practices at https://github.com/Azure/AzureDatabricksBestPractices/tree/master are a good starting point, and you can use the points below as your checklist.

  1. Choose the right Databricks Workspace:

    • Decide on the appropriate Azure region for your Databricks workspace.
    • Consider using the Azure Portal to create the Databricks workspace or use infrastructure-as-code tools like ARM templates or Terraform.
  2. Authentication and Authorization:

    • Integrate Databricks with Azure Active Directory (Azure AD, now Microsoft Entra ID) for authentication.
    • Implement role-based access control (RBAC) to manage authorization.
  3. Networking and Security:

    • Configure Azure Virtual Network (VNet) settings for Databricks clusters.
    • Utilize Azure Network Security Groups (NSGs) for firewall rules.
    • Consider Private Link for enhanced security.
  4. Use Azure Key Vault for Secrets:

    • Store sensitive information such as API keys, passwords, and tokens in Azure Key Vault.
    • Integrate Databricks with Azure Key Vault for secure access to secrets.
  5. Cluster Configuration:

    • Leverage Azure Databricks Autoscaling for dynamic resource allocation.
    • Integrate Databricks clusters with Azure Virtual Network for enhanced security.
  6. Data Storage and Integration:

    • Use Azure Data Lake Storage (ADLS) or Azure Blob Storage for data storage.
  7. Logging and Monitoring:

    • Configure Azure Monitor for logging and monitoring.
    • Utilize Azure Log Analytics for centralized log storage and analysis.
  8. Azure Databricks Jobs:

    • Schedule jobs using Azure Databricks Jobs for automated execution.
    • Use Azure Data Factory for orchestrating ETL workflows if needed.
  9. Azure Databricks Delta Lake:

    • Consider using Delta Lake for efficient storage, management, and processing of big data.
    • Utilize Delta Lake for ACID transactions and schema evolution.
  10. Azure DevOps Integration:

    • Integrate Databricks with Azure DevOps for continuous integration and deployment.
    • Automate deployments using Azure DevOps pipelines.
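
To make points 4 and 5 a bit more concrete, here is a minimal sketch of a cluster definition that turns on autoscaling and pulls a credential from a secret scope instead of hard-coding it. The field names (`autoscale`, `autotermination_minutes`, `spark_env_vars`) and the `{{secrets/<scope>/<key>}}` reference follow the Databricks Clusters API as I understand it. The helper `build_cluster_spec`, the environment variable `STORAGE_KEY`, the scope and key names, and the runtime and VM size values are placeholders I made up for illustration, so adjust them to your workspace.

```python
import json


def build_cluster_spec(name, spark_version, node_type_id,
                       min_workers=2, max_workers=8,
                       secret_scope=None, secret_key=None):
    """Build a dict shaped like a Clusters API create request (hypothetical helper)."""
    # Sanity check chosen for this sketch, not a rule taken from the API.
    if min_workers < 1 or max_workers < min_workers:
        raise ValueError("need 1 <= min_workers <= max_workers")

    spec = {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        # Autoscaling: Databricks picks a worker count within this range.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Shut down idle clusters to avoid paying for unused compute.
        "autotermination_minutes": 30,
    }

    if secret_scope and secret_key:
        # Reference a secret (for example from a Key Vault-backed scope)
        # rather than placing the value in the cluster definition.
        reference = "{{secrets/" + secret_scope + "/" + secret_key + "}}"
        spec["spark_env_vars"] = {"STORAGE_KEY": reference}

    return spec


if __name__ == "__main__":
    # Placeholder values; replace with ones valid in your workspace.
    spec = build_cluster_spec(
        "etl-cluster", "13.3.x-scala2.12", "Standard_DS3_v2",
        min_workers=2, max_workers=6,
        secret_scope="kv-scope", secret_key="storage-key",
    )
    print(json.dumps(spec, indent=2))
```

The resulting JSON could be sent to the Clusters API or adapted into a Terraform or ARM definition. Keeping it in source control also fits the Azure DevOps point in item 10.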
