11-17-2025 08:56 PM
I'm new to writing requirement definitions, and I'd like to ask a question about interface (I/F) security.
My question is:
Do I need to define the authentication and security mechanisms (such as OAuth2, Managed Identity, Service Principals, etc.) between the systems shown below? Or do I also need to define security between the bronze, silver, and gold layers within the lakehouse?
Our data pipeline is:
VPC on AWS (client system) → S3 → Lakehouse (bronze → silver → gold) → Serverless compute
11-18-2025 02:40 AM
| Interface or Layer | Should You Define Security? | Typical Mechanisms | Reference Links |
| --- | --- | --- | --- |
| VPC → S3 | Yes | IAM roles, service accounts, credentials, policies | https://docs.aws.amazon.com/AmazonS3/latest/userguide/security-best-practices.html |
| S3 → Lakehouse | Yes | Service principals, managed identities, access keys | https://docs.databricks.com/security/access-control/service-principals.html |
| Lakehouse Bronze → Silver → Gold | Sometimes (context-driven) | Platform roles, catalog permissions, ACLs, data masking | https://docs.databricks.com/data-governance/unity-catalog/index.html |
| Lakehouse → Serverless Compute | Yes | Managed identities, OAuth2, tokens, ACLs | https://learn.microsoft.com/en-us/azure/architecture/serverless/security-serverless-applications |
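As one possible illustration of the "policies" entry in the VPC → S3 row, here is a minimal sketch of a bucket policy that rejects uploads unless they arrive through a specific VPC endpoint. The bucket name and endpoint ID are hypothetical placeholders, and limiting the rule to `s3:PutObject` is an assumption about the design (it leaves the read path from the lakehouse to be governed separately); `aws:SourceVpce` is the standard AWS condition key for this kind of check.

```python
import json


def vpc_only_upload_policy(bucket: str, vpc_endpoint_id: str) -> dict:
    """Build a bucket policy that denies uploads not sent via one VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUploadsOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                # Assumption: only writes are restricted to the client VPC.
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            }
        ],
    }


# Hypothetical names, for illustration only.
policy = vpc_only_upload_policy("example-landing-bucket", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```

In a requirements document, it is probably enough to state the rule ("uploads are accepted only from the client VPC endpoint") and leave the exact policy JSON to the design phase.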
11-18-2025 02:53 AM
I'll try to summarize and go straight to the key points as I see them:
- Client to S3: SAS token or OAuth 2.0 with service-to-service authentication (preferred)
- Databricks to S3: Use a service principal or managed identities (preferred)
- Bronze/Silver/Gold: Create a separate catalog per layer, or separate schemas/databases within one catalog, to hold the bronze, silver, and gold layers, all under Unity Catalog governance. Then you can grant users, groups, or service principals the appropriate permissions depending on which layer they should be allowed to interact with.
- Serverless cluster: Under "Permissions" you can set who can access it and how. Configure as needed.
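To make the Bronze/Silver/Gold point a bit more concrete, here is a rough sketch that generates Unity Catalog `GRANT` statements from a per-layer access map. It assumes the one-catalog-per-layer option, and the catalog names, group names, and privilege sets are all hypothetical; adjust them to whatever your requirements actually say.

```python
# Hypothetical access map: catalog (layer) -> principal -> privileges.
LAYER_ACCESS = {
    "bronze": {"ingest-pipeline": ["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY"]},
    "silver": {"data-engineers": ["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY"]},
    "gold": {"analysts": ["USE CATALOG", "USE SCHEMA", "SELECT"]},
}


def grant_statements(layer_access: dict) -> list:
    """Turn the access map into one catalog-level GRANT statement per principal."""
    statements = []
    for catalog, principals in layer_access.items():
        for principal, privileges in principals.items():
            privilege_list = ", ".join(privileges)
            # Backticks quote principal names that contain hyphens.
            statements.append(
                f"GRANT {privilege_list} ON CATALOG {catalog} TO `{principal}`"
            )
    return statements


statements = grant_statements(LAYER_ACCESS)
for statement in statements:
    print(statement)
```

Privileges granted on a catalog are inherited by the schemas and tables inside it, so this keeps the grants short. If the layers are schemas inside a single catalog instead, the same idea applies with `ON SCHEMA <catalog>.<schema>`.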