06-03-2025 12:22 AM
Hi everyone, this is my first post here. I'm doing my best to write in English, so I apologize if anything is unclear.
I'm looking to understand the best practices for how many environments to set up when using Databricks on AWS. I'm considering the following patterns and would appreciate your thoughts on which configuration is most appropriate:
Also, if any of these configurations are not technically feasible, I’d appreciate it if you could point that out as well.
Thanks in advance!
06-03-2025 01:51 AM
It’s about setting up multiple environments such as development, testing, and production.
06-09-2025 03:02 AM
Hey @r_w_
Let me share my experience, since I’ve worked with different clients using Databricks on AWS, and I’ve seen several setups in practice:
Option A: Multiple Databricks accounts and multiple AWS accounts
This model offers the highest level of isolation. Each environment lives in its own Databricks and AWS account, allowing for complete separation of resources, users, and billing. It’s good if you are a large organization. But it’s also the most expensive and complex to maintain, since it involves duplicating configurations, user management, pipelines, and access control. I wouldn’t recommend this option unless you have very strong isolation or compliance requirements.
Option B: A single Databricks account with multiple workspaces and multiple AWS accounts
This model enables environment separation at the infrastructure level using different AWS accounts, while still maintaining centralized management of workspaces under a single Databricks account. It strikes a balance between isolation and operational efficiency. I would recommend this setup because it keeps operations simple: you just need to manage cross-account access in AWS, and everything stays centralized on the Databricks side.
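To make the cross-account part concrete, here is a sketch of the IAM trust policy Databricks typically requires on the cross-account role in each environment's AWS account. The principal shown is the Databricks AWS account documented for the standard commercial deployment; the external ID placeholder stands for your own Databricks account ID, so verify both against the current Databricks docs before using this:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::414351767826:root" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "<your-databricks-account-id>" }
      }
    }
  ]
}
```

You would create one such role per AWS account (dev, staging, prod) and register each as a credential configuration in the Databricks account console when creating the corresponding workspace.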
Option C: A single Databricks account with multiple workspaces and a single AWS account
This setup is common in small to medium companies. Isolation is achieved through separate networks (e.g., different subnets or VPCs), IAM policies, and logical separation of data using S3 buckets or Unity Catalog. It’s a simple, effective, and easy-to-manage approach. While it doesn’t offer the same level of isolation as the previous options, it’s often more than enough for most use cases.
If you’re a small company and don’t have very strict isolation needs, I’d go with this option. Just keep in mind that, since everything lives in a single AWS account, you’ll need to clearly separate what’s dev/pre/prod and put extra care into access policies to ensure everything is properly segmented.
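As an illustration of that segmentation within a single AWS account, here is a minimal sketch of an IAM policy you might attach to a dev-environment role. The bucket names are hypothetical; the idea is simply to allow the dev buckets and explicitly deny the prod ones:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowDevData",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-datalake-dev",
        "arn:aws:s3:::my-datalake-dev/*"
      ]
    },
    {
      "Sid": "DenyProdData",
      "Effect": "Deny",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-datalake-prod",
        "arn:aws:s3:::my-datalake-prod/*"
      ]
    }
  ]
}
```

Unity Catalog privileges (separate catalogs per environment with grants per group) give you the same kind of separation at the data-governance layer.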
Hope this helps, 🙂
Isi
06-09-2025 07:46 PM
Hello, @Isi
Thank you for your response. I understand that there is no definitive best practice and that we need to approach this flexibly depending on the requirements, company size, and complexity of the setup.
If you happen to know, I would also appreciate it if you could tell me the following:
When purchasing Databricks on AWS via the AWS Marketplace, my understanding is that the charges are tied to that specific AWS account.
For example, in the case of Option B, would the charges for multiple workspaces all be tied to the AWS account that was initially used to purchase from the Marketplace? Or is there another way to handle this?
I’d appreciate it if you could share any insights for reference.
06-10-2025 06:12 AM
Hello @r_w_
Yes, when you purchase Databricks on AWS through the Marketplace, the charges are tied to the AWS account used to make the purchase. This account becomes the billing account, and all workspaces created under that subscription are associated with that same billing relationship.
Databricks uses a pay-as-you-go model, meaning you are billed based on actual resource usage (compute time, DBUs, storage, and so on).
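As a rough illustration of how that pay-as-you-go math works, here is a small sketch. All of the figures below (DBU rate, DBUs per node-hour, EC2 price) are hypothetical; real rates depend on your SKU, instance type, and contract, so check your Databricks pricing page and AWS bill for actual numbers:

```python
# Rough pay-as-you-go cost estimate for one job run.
# Total cost = Databricks DBU charges + underlying EC2 charges.
# All rates below are hypothetical, for illustration only.

def estimate_run_cost(num_workers: int,
                      dbus_per_node_hour: float,
                      dbu_rate_usd: float,
                      ec2_usd_per_node_hour: float,
                      runtime_hours: float) -> float:
    node_hours = num_workers * runtime_hours
    dbu_cost = node_hours * dbus_per_node_hour * dbu_rate_usd
    ec2_cost = node_hours * ec2_usd_per_node_hour
    return round(dbu_cost + ec2_cost, 2)

# Example: 4 workers for 2 hours, 0.75 DBU per node-hour at $0.15/DBU,
# plus $0.192/hour per EC2 instance (all hypothetical figures).
print(estimate_run_cost(4, 0.75, 0.15, 0.192, 2.0))  # 2.44
```

The key point is that the bill is driven by usage, not by how many workspaces exist: an idle workspace generates essentially no compute charges.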
Within the Databricks platform, you also have access to usage monitoring tools, including:
Billable usage logs: Exportable to S3 or directly viewable via the Databricks UI.
Workspace-level dashboards: See usage broken down by job, user, cluster, or SQL warehouse.
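Once you have a billable usage export, splitting the bill back out per workspace is straightforward. Here is a minimal sketch that sums DBUs per workspace from such a CSV; the column names (`workspaceId`, `dbus`) follow the documented billable-usage schema, but verify them against an actual export before relying on this:

```python
# Sketch: summarize a Databricks billable-usage CSV export by workspace.
import csv
import io
from collections import defaultdict

def dbus_by_workspace(csv_text: str) -> dict:
    """Sum the 'dbus' column per 'workspaceId' from a billable-usage CSV."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["workspaceId"]] += float(row["dbus"])
    return dict(totals)

# Tiny fabricated sample in the same shape as a real export:
sample = """workspaceId,sku,dbus
111,STANDARD_ALL_PURPOSE_COMPUTE,12.5
222,JOBS_COMPUTE,3.0
111,JOBS_COMPUTE,2.5
"""
print(dbus_by_workspace(sample))  # {'111': 15.0, '222': 3.0}
```

This is exactly the kind of breakdown that lets you attribute costs per environment even when all workspaces bill through the single Marketplace subscription.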
Hope this helps 🙂
Isi
06-17-2025 06:42 PM
Thank you for your reply.
I now clearly understand how billing works when purchasing Databricks through the Marketplace.
I appreciate your support and will reach out again if I have any further questions.
06-21-2025 07:47 AM
Hey @r_w_
If you think my answer was correct, it would be great if you could mark it as a solution to help future users 🙂
Thanks,
Isi