Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

De facto Standard for Databricks on AWS

Kousuke_0716
New Contributor

Hello,

I am working on creating an architecture diagram for Databricks on AWS.
I would like to adopt the de facto standard used by enterprises. Based on my research, I have identified the following components:

  • Network: Customer-managed VPC, Secure Cluster Connectivity (SCC)
  • Data Storage: Delta Lake (S3)
  • Data Catalog: Unity Catalog
  • Data Pipeline: Fivetran (ETL), dbt (data transformation)
  • Query Engine: Photon (SQL acceleration)
  • Security: IAM + Unity Catalog (RBAC)
  • Monitoring & Operations: AWS CloudWatch, Databricks Audit Logs

If there are any other important aspects I should consider, please let me know.
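To make the pipeline pieces concrete, here is a minimal sketch of how the ingestion and dbt transformation steps might be wired together as a Databricks Job (Jobs API 2.1 payload). This is not an official template; the notebook path, cluster sizing, and task names are hypothetical placeholders, and the `dbt_task` assumes a dbt project available to the workspace.

```python
import json

# Hypothetical Jobs API 2.1 payload: an ingestion-validation task followed
# by a dbt transformation task, sharing one job cluster.
job_payload = {
    "name": "daily-elt-pipeline",
    "tasks": [
        {
            "task_key": "ingest",
            # Placeholder notebook that checks the Delta tables landed in S3.
            "notebook_task": {"notebook_path": "/Pipelines/validate_ingestion"},
            "job_cluster_key": "elt_cluster",
        },
        {
            "task_key": "transform",
            # Runs only after ingestion validation succeeds.
            "depends_on": [{"task_key": "ingest"}],
            "dbt_task": {"commands": ["dbt deps", "dbt run"]},
            "job_cluster_key": "elt_cluster",
        },
    ],
    "job_clusters": [
        {
            "job_cluster_key": "elt_cluster",
            "new_cluster": {
                "spark_version": "14.3.x-scala2.12",  # example runtime version
                "node_type_id": "i3.xlarge",          # example AWS node type
                "num_workers": 2,
            },
        }
    ],
}

print(json.dumps(job_payload, indent=2))
```

A payload like this would be submitted via the Jobs API or managed declaratively (e.g. with Terraform or Databricks Asset Bundles) rather than hand-written, but it shows where ingestion, transformation, and compute configuration each plug in.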

Thank you!

1 REPLY

-werners-
Esteemed Contributor III

I would not call it a 'standard' but a possible architecture. The great thing about the cloud is that you can complete the puzzle in many ways and make it as complex or as simple as you like.

Also, I would not consider Fivetran to be standard in companies. It is pretty expensive, and there are a lot of lower-cost alternatives available (though perhaps a tad more work).

For transformation, what about Databricks itself?
You also need orchestration (or perhaps that is what you mean by pipelines).
The whole machine learning part is skipped, so you might want to look into that.
And what about DevOps/CI-CD?
