02-13-2024 07:27 AM
Use case and context:
We have a Databricks workspace in a specific region, reading and writing files from/to S3 in the same region.
We also read from a Shared Catalog owned by a different company, a data provider, which points to multi-region S3 buckets.
As a result, we are incurring high NATGateway-Bytes and DataTransfer-Regional-Bytes charges.
Measures that we took to reduce cost:
We implemented an S3 gateway endpoint to route traffic between the Databricks-managed instances in our private subnets and S3 in the same region. We expected this to reduce cost when reading from and writing to our own S3 buckets in the same region, and when reading from the shared catalog that points to multi-region buckets. However, we are seeing no reduction in NATGateway-Bytes or DataTransfer-Regional-Bytes costs.
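One thing that could be checked on our side is whether every private subnet's route table actually has a route to the gateway endpoint, since an endpoint that is not associated with a subnet's route table leaves that subnet's S3 traffic on the NAT gateway. Below is a minimal sketch of such a check, not a definitive diagnostic. The function name and the sample IDs are hypothetical, and the input is assumed to have the shape of the `RouteTables` list returned by the EC2 `DescribeRouteTables` API.

```python
def subnets_missing_gateway_route(route_tables, private_subnet_ids):
    """Return private subnets with no explicitly associated route table
    that contains a gateway-endpoint route.

    Assumptions (hypothetical helper, not an AWS or Databricks API):
    - route_tables looks like the "RouteTables" list from DescribeRouteTables.
    - A gateway-endpoint route has a prefix-list destination ("pl-...")
      and a "GatewayId" starting with "vpce-". This does not tell S3 and
      DynamoDB endpoints apart; compare the prefix list ID to be sure.
    - Subnets that rely on the VPC's main route table have no explicit
      association and will be reported as missing; verify those by hand.
    """
    covered = set()
    for table in route_tables:
        has_endpoint_route = any(
            route.get("DestinationPrefixListId", "").startswith("pl-")
            and route.get("GatewayId", "").startswith("vpce-")
            for route in table.get("Routes", [])
        )
        if not has_endpoint_route:
            continue
        for association in table.get("Associations", []):
            if "SubnetId" in association:
                covered.add(association["SubnetId"])
    return sorted(set(private_subnet_ids) - covered)


# Hypothetical sample data: subnet-a has the endpoint route, subnet-b only has NAT.
sample_route_tables = [
    {
        "RouteTableId": "rtb-1",
        "Associations": [{"SubnetId": "subnet-a"}],
        "Routes": [
            {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-1"},
            {"DestinationPrefixListId": "pl-example", "GatewayId": "vpce-example"},
        ],
    },
    {
        "RouteTableId": "rtb-2",
        "Associations": [{"SubnetId": "subnet-b"}],
        "Routes": [{"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-1"}],
    },
]

missing = subnets_missing_gateway_route(sample_route_tables, ["subnet-a", "subnet-b"])
print(missing)
```

With real data, the input could be taken from boto3's `describe_route_tables()["RouteTables"]` for the workspace VPC. Note also that an S3 gateway endpoint only covers S3 traffic within the VPC's own region, so reads that land on buckets in other regions would still be expected to leave through the NAT gateway.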
Are these costs inevitable? What could be wrong in our networking setup? Is there any other alternative?