Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Multi Customer setup

Kroy
Contributor

We are trying to do a POC where compute resources (e.g., clusters) are shared across multiple customers, while each customer's storage stays separate. Is this possible?


1 ACCEPTED SOLUTION

Accepted Solutions

Kaniz_Fatma
Community Manager

Hi @Kroy, when it comes to shared compute resources in Databricks, there are some best practices and options you can consider:

 

Shared Access Mode for Clusters:
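A minimal sketch of what this looks like in practice, assuming you create the cluster through the Databricks Clusters API: shared access mode is requested with the `data_security_mode` field set to `USER_ISOLATION`, which lets multiple users share one cluster while Unity Catalog enforces per-user data permissions. The cluster name, runtime version, and node type below are illustrative placeholders.

```python
# Sketch of a Clusters API payload requesting shared access mode.
# All names here are hypothetical; adjust to your cloud and workspace.
shared_cluster_spec = {
    "cluster_name": "shared-poc-cluster",       # hypothetical name
    "spark_version": "14.3.x-scala2.12",        # pick a supported LTS runtime
    "node_type_id": "i3.xlarge",                # adjust to your cloud/region
    "data_security_mode": "USER_ISOLATION",     # i.e. "shared" access mode
    "autoscale": {"min_workers": 1, "max_workers": 4},
}
```

You would POST this payload to the workspace's cluster-create endpoint (or pass it to the Databricks SDK); the key point is the `data_security_mode` setting.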

All-Purpose Clusters:

Photon for Faster Queries:

Cluster Sizing Considerations:

  • When creating clusters, choose the size of nodes and the number of workers based on the specific operations your workload performs.
  • For example, if you expect frequent shuffles, using a large single node might be more efficient than multiple smaller nodes.
  • Run VACUUM on a cluster with autoscaling set to 1-4 workers, where each worker has 8 cores. Increase the driver size if you encounter out-of-memory errors during the vacuum process.
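As a sketch of the VACUUM step described above, here is the Spark SQL string you would run on such a small autoscaling cluster. The table name and retention window are illustrative assumptions, not values from the thread.

```python
# Hypothetical Unity Catalog table and retention window for illustration.
table = "main.sales.transactions"   # placeholder three-level table name
retention_hours = 168               # 7 days; lowering below the default needs care
vacuum_sql = f"VACUUM {table} RETAIN {retention_hours} HOURS"
# On a live cluster you would execute it with: spark.sql(vacuum_sql)
```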

Job Clusters for Operationalization:

  • Once you've completed development and are ready to operationalize your code, switch to running it on job clusters.
  • Job clusters terminate when the job ends, reducing resource usage and costs. They are ideal for orchestrated tasks.
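A hedged sketch of the two bullets above, assuming you define the job through the Databricks Jobs API: putting a `new_cluster` block inside the task (instead of pointing at an existing all-purpose cluster) gives you a job cluster that is created for the run and terminates when the job ends. The job name, notebook path, and cluster sizing are placeholders.

```python
# Sketch of a Jobs API task that runs on a job cluster.
# Names and paths are hypothetical examples.
job_spec = {
    "name": "nightly-etl",  # hypothetical job name
    "tasks": [
        {
            "task_key": "etl",
            "notebook_task": {"notebook_path": "/Repos/etl/main"},  # placeholder
            "new_cluster": {  # job cluster: created per run, terminates at job end
                "spark_version": "14.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
        }
    ],
}
```

The design point is simply `new_cluster` vs. reusing a long-running cluster: the former releases its resources automatically, which is what makes it cheaper for orchestrated tasks.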

Delta Sharing for Data Sharing:
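For the multi-customer setup in the question, Delta Sharing lets you expose the same data to different recipients without sharing compute. A minimal sketch of the Unity Catalog SQL flow, with illustrative object names (the share, table, and recipient names are assumptions):

```python
# Hedged sketch: the Delta Sharing SQL flow, expressed as statements you
# would run with spark.sql(...) on a Unity Catalog-enabled cluster.
statements = [
    "CREATE SHARE IF NOT EXISTS customer_share",
    "ALTER SHARE customer_share ADD TABLE main.sales.transactions",
    "CREATE RECIPIENT IF NOT EXISTS acme_corp",
    "GRANT SELECT ON SHARE customer_share TO RECIPIENT acme_corp",
]
# On a live cluster: for s in statements: spark.sql(s)
```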

Remember that while shared compute resources are possible, you'll need to carefully plan and configure your clusters based on your specific requirements and use cases.

 

Databricks provides flexibility, and with the right choices, you can achieve efficient resource utilization across multiple customers. 🚀

View solution in original post

2 REPLIES


Do you know if we can create a single catalog and multiple metastores?

@Kaniz_Fatma 

 
