Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

AWS NAT (Network Address Translation) Automated On-demand Destruct / Create

csmcpherson
New Contributor II

Hi folks, 

Our company typically uses Databricks during a 12-hour block; however, the AWS NAT gateway for elastic compute is up 24 hours a day, and I'd rather not pay for the idle hours.

I gather AWS Lambda and CloudWatch can be used to schedule / trigger NAT destruction and creation.

1. Has anyone tried this with success, and can you provide guidance on best practice here?
2. Are there any important considerations to bear in mind (i.e., will removal of the NAT also destroy attached route tables / security groups / Elastic IP allocation)?

Thank you.

Accepted Solutions

Kaniz_Fatma
Community Manager

Hi @csmcpherson

  • Yes, you can indeed use AWS Lambda and CloudWatch to schedule and trigger NAT gateway destruction and creation. This approach allows you to save costs by only having the NAT gateway active during the hours you need it.
  • Here are the general steps:
    • Create a Lambda function that interacts with the AWS SDK to manage the NAT gateway.
    • Set up a CloudWatch Events rule (the service is now called Amazon EventBridge) to trigger the Lambda function on the desired schedule (e.g., create at the beginning of your 12-hour block and delete at the end).
    • In the Lambda function, you can use the boto3 EC2 client's create_nat_gateway and delete_nat_gateway methods to create and destroy the NAT gateway.
  • Keep in mind that you’ll need appropriate IAM permissions for the Lambda function to manage NAT gateways.
  • Deleting a NAT gateway does not delete the route tables, security groups, or Elastic IP allocation around it. The Elastic IP is disassociated but stays allocated to your account, and any route that targeted the gateway stays in its route table with a blackhole status.
  • However, consider the following:
    • Route Tables: The recreated NAT gateway gets a new ID, so the default route (0.0.0.0/0) in each private subnet's route table must be updated to target it. Until then, the old route remains a blackhole and outbound traffic is dropped.
    • Elastic IPs: Because the Elastic IP stays allocated after deletion, you can pass the same allocation ID when creating the new NAT gateway and keep the same public IP. If you release it instead, the new gateway gets a different public IP, which matters if anything allowlists that address.
    • Security Groups: Security groups cannot be associated with a NAT gateway. Traffic is controlled by the security groups on your instances and the network ACLs on the subnets, none of which change when the gateway is recreated.
    • Test thoroughly in a non-production environment before implementing this in your production setup.
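
The steps above can be sketched roughly as follows. This is a minimal illustration rather than a tested production setup: it assumes one NAT gateway in one public subnet, an Elastic IP that is kept allocated between runs, and a default route that already exists in each private route table. The function names and the event keys (`action`, `public_subnet_id`, `allocation_id`, `route_table_ids`) are hypothetical and chosen for this example only.

```python
"""Sketch: scheduled delete / create of a NAT gateway from AWS Lambda."""

DEFAULT_ROUTE = "0.0.0.0/0"

# IAM actions these calls need at a minimum (assumption: no tagging):
#   ec2:DescribeNatGateways, ec2:DeleteNatGateway,
#   ec2:CreateNatGateway, ec2:ReplaceRoute


def delete_nat(ec2, public_subnet_id):
    """Delete every available NAT gateway in the given subnet.

    The gateway ID changes on each recreation, so it is looked up
    rather than hard-coded. The Elastic IP is only disassociated.
    """
    response = ec2.describe_nat_gateways(
        Filter=[
            {"Name": "subnet-id", "Values": [public_subnet_id]},
            {"Name": "state", "Values": ["available"]},
        ]
    )
    deleted = []
    for gateway in response["NatGateways"]:
        ec2.delete_nat_gateway(NatGatewayId=gateway["NatGatewayId"])
        deleted.append(gateway["NatGatewayId"])
    return deleted


def create_nat(ec2, public_subnet_id, allocation_id, route_table_ids):
    """Create a NAT gateway on the existing Elastic IP and repoint routes."""
    response = ec2.create_nat_gateway(
        SubnetId=public_subnet_id,
        AllocationId=allocation_id,
    )
    nat_id = response["NatGateway"]["NatGatewayId"]

    # Routes can only target the gateway once it is available.
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

    # The old default route still exists (as a blackhole), so replace it.
    for route_table_id in route_table_ids:
        ec2.replace_route(
            RouteTableId=route_table_id,
            DestinationCidrBlock=DEFAULT_ROUTE,
            NatGatewayId=nat_id,
        )
    return nat_id


def lambda_handler(event, context):
    """Entry point; `event` carries the hypothetical keys described above."""
    import boto3  # imported here so the helpers can run against a stub client

    ec2 = boto3.client("ec2")
    if event["action"] == "delete":
        return {"deleted": delete_nat(ec2, event["public_subnet_id"])}
    nat_id = create_nat(
        ec2,
        event["public_subnet_id"],
        event["allocation_id"],
        event["route_table_ids"],
    )
    return {"created": nat_id}
```

Waiting for the gateway to become available typically takes longer than Lambda's default 3-second timeout, so the function timeout would need to be raised for the create path.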

Remember to adapt these steps to your specific environment and requirements.
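
For the scheduling step, one possible way to wire a rule to the Lambda function is sketched below. The helper name, rule names, and payload shape are hypothetical; `events` is assumed to be a boto3 EventBridge client (`boto3.client("events")`), and schedule expressions are evaluated in UTC.

```python
import json


def schedule_nat_action(events, rule_name, cron_expression, lambda_arn, payload):
    """Create (or update) a scheduled rule that invokes the Lambda function.

    `payload` is passed to the function as its event, e.g. {"action": "delete"}.
    """
    rule = events.put_rule(
        Name=rule_name,
        ScheduleExpression=cron_expression,
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[
            {
                "Id": f"{rule_name}-target",
                "Arn": lambda_arn,
                "Input": json.dumps(payload),
            }
        ],
    )
    return rule["RuleArn"]
```

A weekday 07:00 UTC creation could then use an expression such as `cron(0 7 ? * MON-FRI *)`, with a second rule for the evening deletion. The Lambda function also needs a resource-based permission allowing `events.amazonaws.com` to invoke it, which this sketch does not set up.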

If you encounter any issues during implementation, feel free to ask for further assistance! 😊

 


2 REPLIES

@Kaniz_Fatma 
Thanks for your reply. 


I created some Lambda functions to execute the NAT delete / create approach, factoring in route tables, elastic IP details and security groups per the forum guide. 

However, Databricks is now unable to connect to the EC2 resources: the clusters can initiate EC2 instance start-up, but cannot connect to the instance or even terminate it, and Databricks (DLT and compute) is constantly "waiting for resource", even though the instance is running in AWS.

Is there anything that I may have missed?

Lambda functions are below:
== delete lambda ==
[Screenshot: csmcpherson_0-1721090211510.png]

== create lambda ==
[Screenshots: csmcpherson_0-1721090500932.png, csmcpherson_2-1721090284289.png]
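
The Lambda code in the screenshots is not recoverable here, so the cause cannot be confirmed from this thread. One check that fits the symptoms is whether any route table still holds a blackhole route pointing at the deleted NAT gateway, since instances behind such a route can start but cannot reach anything outside the VPC. A minimal sketch of that check, assuming a boto3 EC2 client and a result that fits in a single response page (the helper name is hypothetical):

```python
def find_blackhole_routes(ec2, vpc_id):
    """Return (route_table_id, destination, nat_gateway_id) for stale routes."""
    response = ec2.describe_route_tables(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )
    stale = []
    for table in response["RouteTables"]:
        for route in table.get("Routes", []):
            # A route whose target no longer exists is reported as "blackhole".
            if route.get("State") == "blackhole":
                stale.append(
                    (
                        table["RouteTableId"],
                        route.get("DestinationCidrBlock"),
                        route.get("NatGatewayId"),
                    )
                )
    return stale
```

An empty list suggests the routes were repointed correctly; any entry names a route table that still targets a gateway that no longer exists.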