In the last few days, I had to set up a complete AWS infrastructure for a startup. The objective was to have EC2 machines that are auto-scaled into existence access an S3 bucket on initialization to fetch some configuration that shouldn’t be baked into a snapshot.
The script worked fine when I ran it as a test from a public instance that I would SSH into, but I couldn’t get the connection to work from a private EC2 instance.
I researched a little bit and found that private instances do not have access to the internet. A private subnet is a subnet where you do not assign public IP addresses to the EC2 instances that get booted into it. The other side of this is usually a public subnet with an internet gateway (IGW) attached to it through a routing table that routes “0.0.0.0/0” to the IGW.
Routing, NAT gateways and region specifiers
To ensure your machine has access to the internet, there are 3 things you need to do.
- Create a NAT gateway for your private subnet(s); the gateway itself sits in a public subnet
- Create a routing table for the private subnet(s) that sends “0.0.0.0/0” to the NAT gateway
- Ensure your “aws s3” command includes a “--region something” when you run it
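The first two steps can be sketched in Python. This is only a rough outline, assuming a boto3 EC2 client is passed in as `ec2`; the function name and the subnet and routing table ids are hypothetical placeholders, not something from my actual setup.

```python
def add_nat_route(ec2, public_subnet_id, private_route_table_id):
    """Create a NAT gateway and send the private subnets' default route to it.

    `ec2` is assumed to be a boto3 EC2 client, for example
    boto3.client("ec2", region_name="us-east-1").
    """
    # A public NAT gateway needs an Elastic IP of its own.
    allocation = ec2.allocate_address(Domain="vpc")

    # The gateway is created inside a PUBLIC subnet.
    nat = ec2.create_nat_gateway(
        SubnetId=public_subnet_id,
        AllocationId=allocation["AllocationId"],
    )
    nat_id = nat["NatGateway"]["NatGatewayId"]

    # The route cannot be used until the gateway is available.
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

    # "Everything else" from the private subnets now goes to the NAT gateway.
    ec2.create_route(
        RouteTableId=private_route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_id,
    )
    return nat_id
```

The client is passed in rather than created inside the function so that the region and credentials stay the caller's decision.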
The difference between an IGW and a NATGW
So what is the difference between the NAT Gateway and the Internet Gateway (IGW)? The difference is a little like the router you have at home. Your router is usually configured as a NAT gateway: it translates the packets that go out to the internet so that the response can find its way back to the sender, but it doesn’t allow unsolicited traffic into your house.
The Internet Gateway is more like the DMZ configuration on your router: it allows internet users to reach your subnet and then an instance, provided that the routing table attached to the subnet routes traffic through it. More importantly, it is what allows EC2 instances with a public IP to be NATed in and out; private instances cannot use it.
Usually, you would have some kind of public EC2 machine with a public IP acting as an SSH bastion, or a load balancer such as an “Elastic Load Balancer” (ELB), sitting in a public subnet of your VPC and receiving traffic through the IGW.
There isn’t a really big difference between the routing tables for the public subnets and the private subnets. In your VPC, with a standard simple public/private scenario, you would have public subnets and private subnets. You will have as many subnets of each kind as you have availability zones, but only 2 routing tables: one for the private subnets and one for the public subnets.
In the public subnets, you leave the default routing entry that keeps everything addressed to your VPC’s CIDR block local to the VPC. Then you add a “0.0.0.0/0” (everything else) route that points to the IGW.
In the private subnets, you leave the same default entry that keeps everything addressed to your VPC’s CIDR block local to the VPC. Then you add a “0.0.0.0/0” (everything else) route that points to the NATGW.
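To make the two routing tables concrete, here is a small simulation in Python. It is only an illustration: the “10.0.0.0/16” CIDR block and the target names are made up, and the lookup simply picks the most specific matching route.

```python
import ipaddress


def next_hop(route_table, destination):
    """Return the target of the most specific route matching `destination`."""
    address = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if address in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The longest prefix (most specific route) wins.
    return max(matches)[1]


VPC_CIDR = "10.0.0.0/16"  # hypothetical VPC CIDR block

# Same local entry in both tables; only the "everything else" route differs.
PUBLIC_ROUTES = {VPC_CIDR: "local", "0.0.0.0/0": "igw"}
PRIVATE_ROUTES = {VPC_CIDR: "local", "0.0.0.0/0": "natgw"}

print(next_hop(PUBLIC_ROUTES, "10.0.3.7"))     # local
print(next_hop(PUBLIC_ROUTES, "52.216.0.1"))   # igw
print(next_hop(PRIVATE_ROUTES, "10.0.3.7"))    # local
print(next_hop(PRIVATE_ROUTES, "52.216.0.1"))  # natgw
```

Traffic between instances stays inside the VPC in both cases; only traffic to outside addresses takes a different exit.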
This means that you should not put any instance with a public IP address in a private subnet; those instances belong in the public subnets and use the IGW to go outside and to receive incoming traffic. All other instances, the ones that do not have a public IP, should be in the private subnets and will use the NATGW to go out to the internet.
Do I need a NAT Gateway?
Not necessarily. If you have a private subnet with private instances and you only need access to S3, then add an S3 endpoint to your VPC and ditch that NATGW; you won’t need it.
I need it because I do more than just S3 access. When I want to rebuild my snapshot image, I just take an existing private instance, clean it up and then snapshot it again, but that usually requires APT commands, which you won’t be able to use without access to the public internet.
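For the S3-only case, a gateway endpoint could be created along these lines. This is a sketch under assumptions: `ec2` is a boto3 EC2 client, and the function name, VPC id and routing table ids are hypothetical.

```python
def add_s3_gateway_endpoint(ec2, vpc_id, private_route_table_ids, region):
    """Create a gateway VPC endpoint for S3 and return its id.

    `ec2` is assumed to be a boto3 EC2 client for the same region.
    """
    response = ec2.create_vpc_endpoint(
        VpcId=vpc_id,
        # S3 service names follow the pattern com.amazonaws.<region>.s3
        ServiceName=f"com.amazonaws.{region}.s3",
        VpcEndpointType="Gateway",
        # The S3 routes get added to these (private) routing tables.
        RouteTableIds=list(private_route_table_ids),
    )
    return response["VpcEndpoint"]["VpcEndpointId"]
```

With the endpoint attached to the private routing table, S3 traffic no longer depends on the “0.0.0.0/0” route.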
Ok, what about that --region option?
When you make an AWS call from an EC2 instance, it uses the local credentials chain to identify who is making the call and what permissions they have. If you do not have any AWS CLI configuration, then there is no region configured on the machine. Should you configure one? Up to you. I remember that in a previous job we tried with a configuration and then with a --region option tacked on to each of our commands. From memory, we had issues with both, and I think we ended up configuring it on the machine.
On this new project, I decided to go with the “--region” option on my commands. I may regret it later, but I have a much simpler architecture and devops structure for now, so it’s no big deal.
You can run your AWS S3 commands with a --region option or just configure the region; it is as you wish, but remember to use one of the two or your command will just time out and you’ll be scratching your head for some time.
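As an illustration of the first approach, here is how an initialization script could build the command with the region always tacked on. It is a sketch only: the function names, the bucket path and the region value are hypothetical placeholders.

```python
import subprocess


def s3_copy_command(source, destination, region):
    """Build an `aws s3 cp` command line with an explicit --region."""
    return ["aws", "s3", "cp", source, destination, "--region", region]


def fetch_config(source, destination, region, timeout=60):
    """Run the copy, failing loudly instead of hanging forever."""
    subprocess.run(
        s3_copy_command(source, destination, region),
        check=True,       # raise if the CLI exits with an error
        timeout=timeout,  # raise if the call hangs, e.g. no route out
    )


# Hypothetical bucket, path and region, shown without running anything:
print(" ".join(s3_copy_command("s3://my-bucket/app.conf", "/etc/app.conf", "us-east-1")))
# aws s3 cp s3://my-bucket/app.conf /etc/app.conf --region us-east-1
```

The explicit timeout is there so that a missing route or region shows up as an error in the boot logs rather than as a silent hang.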