Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Exploring Serverless Features in Databricks for ML Use Cases

antonionuzzo
New Contributor II

Hello, 

I need to develop some ML use cases. I would like to understand whether serverless compute unlocks any additional features or whether it is mandatory for certain capabilities.

Thank you!

1 ACCEPTED SOLUTION


BigRoux
Databricks Employee
Serverless functionality in Databricks is not mandatory for machine learning (ML) workloads. However, it does unlock specific benefits and features that can enhance certain workflows. Here's how serverless compute can add value:
  1. Performance and Scalability:
    • Serverless compute allows for fast startup times and automatic scalability, which is particularly useful for ML workloads involving exploratory experiments or interactive use cases where efficiency is key.
  2. Cost Optimization:
    • Serverless compute operates in a cost-optimized mode for workflows, notebooks, and Delta Live Tables, reducing costs when resources are not actively in use. This can particularly benefit intermittent ML workloads.
  3. Enhanced Security and Governance:
    • Serverless environments include enhanced security features, such as shared security access modes and Unity Catalog integration, which support secure and compliant ML workflows.
  4. Separating Responsibility:
    • Serverless eliminates the need for manually provisioning and managing clusters, allowing data scientists and ML practitioners to focus entirely on their work without requiring support from infrastructure teams.
  5. Developing and Managing ML Models:
    • While serverless compute supports ML model development and deployment, limitations exist for workloads requiring GPUs, certain ML runtime features, or custom data sources. However, Databricks ML tools like MLflow can still be used effectively within serverless environments for experiment tracking and deployment (see the sketch below this list).
  6. Limitations:
    • Specific functionality like Spark UI debugging, certain Spark configurations, and support for GPUs or cluster-scoped libraries (e.g., .jar files) is limited in serverless environments. Ensure these constraints align with your ML use case.
Serverless compute is beneficial but not mandatory for most Databricks ML workflows.


