a week ago
Context: We're exploring using Kubernetes (EKS) as our compute infrastructure instead of Databricks managed clusters. We want to understand if Databricks can orchestrate, deploy, and monitor jobs that run on a Kubernetes cluster.
Questions:
- Is it possible to configure Databricks Jobs to run workloads on an external Kubernetes cluster?
- Can Databricks manage the job lifecycle (submit, monitor, track logs) on Kubernetes pods?
- What's the recommended way to use Databricks as an orchestration layer for Kubernetes workloads?
What we want to do:
- Write Databricks jobs/notebooks
- Have those jobs execute as pods in our Kubernetes cluster (not Databricks-managed clusters)
- Monitor job execution and logs from the Databricks UI
- Keep using Databricks features (MLflow, Delta Lake) for data and model management
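To make the target setup concrete, here is a rough sketch of the kind of pod we imagine running in EKS. It only builds a standard Kubernetes `batch/v1` Job manifest as a Python dict; it does not show Databricks submitting it, since whether that is possible is exactly what we are asking. The image name, job name, entry-point script, and the `databricks-credentials` secret are placeholders we made up for illustration. The environment variables reflect our understanding that MLflow can log to a Databricks workspace when `MLFLOW_TRACKING_URI` is set to `databricks` and host/token credentials are provided.

```python
import json


def build_k8s_job_manifest(name, image, command, namespace="default",
                           secret_name="databricks-credentials"):
    """Build a Kubernetes batch/v1 Job manifest as a plain dict.

    secret_name is a hypothetical Kubernetes Secret holding the workspace
    URL and an access token under the keys "host" and "token".
    """
    def secret_env(var, key):
        # Pull the value from a Kubernetes Secret instead of hard-coding it.
        return {
            "name": var,
            "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
        }

    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "backoffLimit": 0,  # do not retry failed pods automatically
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "command": command,
                            "env": [
                                # Point MLflow at the Databricks-hosted tracking server.
                                {"name": "MLFLOW_TRACKING_URI", "value": "databricks"},
                                secret_env("DATABRICKS_HOST", "host"),
                                secret_env("DATABRICKS_TOKEN", "token"),
                            ],
                        }
                    ],
                }
            },
        },
    }


# Placeholder values for illustration only.
manifest = build_k8s_job_manifest(
    name="train-model",
    image="example.registry.local/team/train:latest",
    command=["python", "train.py"],
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be applied with `kubectl apply -f` or submitted through a Kubernetes client. What we do not know is how to get this submission, its status, and its logs surfaced in the Databricks UI.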
Why this is important:
- Scalability: Kubernetes provides better horizontal scaling for large workloads
- Faster cluster initialization: An already-running EKS cluster eliminates cluster spin-up delays
- Cost efficiency: EKS with spot instances is more cost-effective than always-on Databricks clusters
- Multi-tenant support: Share compute infrastructure across different teams and workloads
Labels:
- Automl
- Feature Store
- Model Serving