<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Networking reduction cost for NATGateway and Shared Catalog in Administration &amp; Architecture</title>
    <link>https://community.databricks.com/t5/administration-architecture/networking-reduction-cost-for-natgateway-and-shared-catalog/m-p/61368#M911</link>
    <description>&lt;P&gt;Thanks &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;for all the suggestions.&lt;/P&gt;&lt;P&gt;After a few days of monitoring NAT costs, I realized that the S3 Gateway Endpoint was actually working. The problem was that I expected the change to be reflected in costs right away, but it can take a bit more than 24 hours to become visible in AWS Cost Explorer.&lt;/P&gt;&lt;P&gt;From AWS docs:&amp;nbsp;&lt;A href="https://docs.aws.amazon.com/cost-management/latest/userguide/ce-exploring-data.html" target="_blank"&gt;https://docs.aws.amazon.com/cost-management/latest/userguide/ce-exploring-data.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;All costs reflect your usage up to the previous day. For example, if today is December 2, the data includes your usage through December 1.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We already had VPC Flow Logs enabled in the VPC, so using the following queries in CloudWatch Logs Insights, I saw some reduction on the first day, but I wasn't sure whether it was a real reduction or just coincidentally lower traffic:&lt;/P&gt;&lt;PRE&gt;# downloads in total&lt;BR /&gt;filter (dstAddr like '10.0.0.1' and not isIpv4InSubnet(srcAddr, '10.0.0.0/16')) | stats sum(bytes) as bytesTransferred&lt;/PRE&gt;&lt;PRE&gt;# uploads in total&lt;BR /&gt;filter (srcAddr like '10.0.0.1' and not isIpv4InSubnet(dstAddr, '10.0.0.0/16')) | stats sum(bytes) as bytesTransferred&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;So I needed to confirm that all inbound and outbound traffic between the subnets and S3 was actually going through the S3 Gateway Endpoint. 
After some research, I found the list of all AWS IP ranges here &lt;A href="https://docs.aws.amazon.com/vpc/latest/userguide/aws-ip-ranges.html" target="_blank"&gt;https://docs.aws.amazon.com/vpc/latest/userguide/aws-ip-ranges.html&lt;/A&gt;, and then used this simple script to extract only the S3 IP ranges:&lt;/P&gt;&lt;PRE&gt;import json&lt;BR /&gt;&lt;BR /&gt;# Load the AWS IP ranges JSON file (saved locally as aws-ips.json)&lt;BR /&gt;with open('aws-ips.json') as f:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;ip_ranges = json.load(f)&lt;BR /&gt;&lt;BR /&gt;# Filter for S3 IPv4 prefixes in a specific region, e.g., us-east-1&lt;BR /&gt;s3_ips = [prefix["ip_prefix"] for prefix in ip_ranges["prefixes"]&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if prefix["service"] == "S3" and prefix["region"] == "us-east-1"]&lt;BR /&gt;&lt;BR /&gt;print(s3_ips)&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;With those ranges, I was able to write more precise Logs Insights queries to check whether there was still any traffic between our NAT and S3:&lt;/P&gt;&lt;PRE&gt;# downloads from s3&lt;BR /&gt;filter (&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;dstAddr like '10.0.0.1' and (&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;isIpv4InSubnet(srcAddr, '18.34.0.0/19')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '54.231.0.0/16')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '52.216.0.0/15')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '18.34.232.0/21')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '16.182.0.0/16')&lt;BR 
/&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '3.5.0.0/19')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '44.192.134.240/28')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '44.192.140.64/28')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;)&lt;BR /&gt;) | stats sum(bytes) as bytesTransferred&lt;/PRE&gt;&lt;PRE&gt;# uploads to s3&lt;BR /&gt;filter (&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;srcAddr like '10.0.0.1' and (&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;isIpv4InSubnet(dstAddr, '18.34.0.0/19')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '54.231.0.0/16')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '52.216.0.0/15')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '18.34.232.0/21')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '16.182.0.0/16')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '3.5.0.0/19')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '44.192.134.240/28')&lt;BR 
/&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '44.192.140.64/28')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;)&lt;BR /&gt;) | stats sum(bytes) as bytesTransferred&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;After running these queries, I confirmed that there was no traffic, neither downloads nor uploads, between the NAT and S3 right after the S3 Gateway Endpoint was deployed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;NOTE:&lt;/P&gt;&lt;P&gt;A great tool I didn't know about before this is &lt;STRONG&gt;AWS&amp;nbsp;Reachability Analyzer&lt;/STRONG&gt;, which I used to check connectivity between the instances' ENIs in the private subnets and the S3 Gateway Endpoint.&lt;/P&gt;</description>
    <pubDate>Wed, 21 Feb 2024 14:04:43 GMT</pubDate>
    <dc:creator>RaulPino</dc:creator>
    <dc:date>2024-02-21T14:04:43Z</dc:date>
    <item>
      <title>Networking reduction cost for NATGateway and Shared Catalog</title>
      <link>https://community.databricks.com/t5/administration-architecture/networking-reduction-cost-for-natgateway-and-shared-catalog/m-p/60069#M865</link>
      <description>&lt;P&gt;Use case and context:&lt;/P&gt;&lt;P&gt;We have a Databricks workspace in a specific region, reading and writing files from/to the same region.&lt;/P&gt;&lt;P&gt;We also read from a Shared Catalog owned by a different company, a data provider, which points to multi-region S3 buckets.&lt;/P&gt;&lt;P&gt;The result is that we are incurring high&amp;nbsp;NATGateway-Bytes&amp;nbsp;and&amp;nbsp;DataTransfer-Regional-Bytes charges.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Measures that we took to reduce cost:&lt;/P&gt;&lt;P&gt;We implemented an S3 Gateway Endpoint to route traffic between the Databricks-managed instances in private subnets and S3 in the same region. The idea is that this should reduce cost when reading from and writing to our S3 buckets in the same region, and when reading from the shared catalog pointing to multi-region buckets, but we are still seeing no reduction in&amp;nbsp;NATGateway-Bytes&amp;nbsp;and&amp;nbsp;DataTransfer-Regional-Bytes costs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Are these costs inevitable? What could be wrong in our networking setup? Is there any other alternative?&lt;/P&gt;</description>
      <pubDate>Tue, 13 Feb 2024 15:27:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/networking-reduction-cost-for-natgateway-and-shared-catalog/m-p/60069#M865</guid>
      <dc:creator>RaulPino</dc:creator>
      <dc:date>2024-02-13T15:27:22Z</dc:date>
    </item>
    <item>
      <title>Re: Networking reduction cost for NATGateway and Shared Catalog</title>
      <link>https://community.databricks.com/t5/administration-architecture/networking-reduction-cost-for-natgateway-and-shared-catalog/m-p/61363#M908</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;for all the suggestions.&lt;/P&gt;&lt;P&gt;After a few days of monitoring NAT costs, I realized that the S3 Gateway Endpoint was actually working. The problem was that I expected the change to be reflected in costs right away, but it can take a bit more than 24 hours to become visible in AWS Cost Explorer.&lt;/P&gt;&lt;P&gt;From AWS docs:&amp;nbsp;&lt;A href="https://docs.aws.amazon.com/cost-management/latest/userguide/ce-exploring-data.html" target="_blank" rel="noopener"&gt;https://docs.aws.amazon.com/cost-management/latest/userguide/ce-exploring-data.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;All costs reflect your usage up to the previous day. For example, if today is December 2, the data includes your usage through December 1.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We already had VPC Flow Logs enabled in the VPC, so using the following queries in CloudWatch Logs Insights, I saw some reduction on the first day, but I wasn't sure whether it was a real reduction or just coincidentally lower traffic:&lt;/P&gt;&lt;PRE&gt;# downloads in total&lt;BR /&gt;filter (dstAddr like '10.0.0.1' and not isIpv4InSubnet(srcAddr, '10.0.0.0/16')) | stats sum(bytes) as bytesTransferred&lt;/PRE&gt;&lt;PRE&gt;# uploads in total&lt;BR /&gt;filter (srcAddr like '10.0.0.1' and not isIpv4InSubnet(dstAddr, '10.0.0.0/16')) | stats sum(bytes) as bytesTransferred&lt;/PRE&gt;&lt;P&gt;So I needed to confirm that all inbound and outbound traffic between the subnets and S3 was actually going through the S3 Gateway Endpoint. 
After some research, I found the list of all AWS IP ranges here &lt;A href="https://docs.aws.amazon.com/vpc/latest/userguide/aws-ip-ranges.html" target="_blank" rel="noopener"&gt;https://docs.aws.amazon.com/vpc/latest/userguide/aws-ip-ranges.html&lt;/A&gt;, and then used this simple script to extract only the S3 IP ranges:&lt;/P&gt;&lt;PRE&gt;import json&lt;BR /&gt;&lt;BR /&gt;# Load the AWS IP ranges JSON file (saved locally as aws-ips.json)&lt;BR /&gt;with open('aws-ips.json') as f:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;ip_ranges = json.load(f)&lt;BR /&gt;&lt;BR /&gt;# Filter for S3 IPv4 prefixes in a specific region, e.g., us-east-1&lt;BR /&gt;s3_ips = [prefix["ip_prefix"] for prefix in ip_ranges["prefixes"]&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if prefix["service"] == "S3" and prefix["region"] == "us-east-1"]&lt;BR /&gt;&lt;BR /&gt;print(s3_ips)&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;With those ranges, I was able to write more precise Logs Insights queries to check whether there was still any traffic between our NAT and S3:&lt;/P&gt;&lt;PRE&gt;# downloads from s3&lt;BR /&gt;filter (&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;dstAddr like '10.0.0.1' and (&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;isIpv4InSubnet(srcAddr, '18.34.0.0/19')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '54.231.0.0/16')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '52.216.0.0/15')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '18.34.232.0/21')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, 
'16.182.0.0/16')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '3.5.0.0/19')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '44.192.134.240/28')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(srcAddr, '44.192.140.64/28')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;)&lt;BR /&gt;) | stats sum(bytes) as bytesTransferred&lt;/PRE&gt;&lt;PRE&gt;# uploads to s3&lt;BR /&gt;filter (&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;srcAddr like '10.0.0.1' and (&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;isIpv4InSubnet(dstAddr, '18.34.0.0/19')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '54.231.0.0/16')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '52.216.0.0/15')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '18.34.232.0/21')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '16.182.0.0/16')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '3.5.0.0/19')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '44.192.134.240/28')&lt;BR 
/&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or isIpv4InSubnet(dstAddr, '44.192.140.64/28')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;)&lt;BR /&gt;) | stats sum(bytes) as bytesTransferred&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;After running these queries, I confirmed that there was no traffic, neither downloads nor uploads, between the NAT and S3 right after the S3 Gateway Endpoint was deployed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;NOTE:&lt;/P&gt;&lt;P&gt;A great tool worth mentioning is &lt;STRONG&gt;AWS&amp;nbsp;Reachability Analyzer&lt;/STRONG&gt;, which I used to check connectivity between the instances' ENIs in the private subnets and the S3 Gateway Endpoint.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Feb 2024 13:56:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/networking-reduction-cost-for-natgateway-and-shared-catalog/m-p/61363#M908</guid>
      <dc:creator>RaulPino</dc:creator>
      <dc:date>2024-02-21T13:56:20Z</dc:date>
    </item>
  </channel>
</rss>

