Daniel López Azaña

Featured Project

Multilevel network security architecture in AWS with VPC, NAT Gateway and perimeter protection

Design and implementation of enterprise AWS security architecture with multilevel VPC, public and private subnet separation, NAT Gateway for controlled outbound traffic, multi-AZ deployment for high availability, AWS Shield for DDoS protection, AWS WAF for application security and comprehensive backup strategy with AWS Backup.

Organizations operating critical infrastructure on AWS face the constant challenge of balancing security with operability. Throughout multiple projects for different clients in highly regulated sectors, I have designed and implemented enterprise network security architectures that transform vulnerable AWS infrastructures with servers directly exposed to the internet into robust, multilevel, and resilient environments.

AWS network security architecture with multilevel VPC, public/private subnets and NAT Gateway

These clients shared similar vulnerabilities: internal servers with public IP addresses directly accessible from the internet, critical services running in a single availability zone without redundancy, and absence of perimeter security controls that could stop DDoS attacks or web application exploits. The existing infrastructure not only represented a significant security risk but also lacked the necessary resilience to ensure operational continuity in the face of infrastructure failures or malicious attacks.

As an AWS cloud architect specialized in security, I completely transformed these infrastructures by implementing multilevel VPC architectures with strict separation between public and private subnets, multi-availability zone deployment for high availability, NAT Gateways for granular outbound traffic control, AWS Shield and WAF for perimeter protection against attacks, and AWS Backup for disaster recovery. This architectural transformation eliminated single points of failure, drastically reduced the attack surface, and established a defense-in-depth security model that protects critical assets across multiple layers.

The Challenge: Exposed and Vulnerable Monolithic Infrastructure

The initial state of these infrastructures presented critical security vulnerabilities that exposed businesses to significant risks of compromise, data loss, and operational disruptions.

Problems Identified in Original Architecture

  • Servers with public IPs directly exposed to internet: All EC2 instances directly accessible from any location on the internet without perimeter protection layer.
  • Flat network architecture without segmentation: VPC with only public subnets, no separation between front-end and back-end services, no private subnets to isolate internal resources.
  • Single availability zone (single point of failure): Entire infrastructure concentrated in a single availability zone, vulnerable to complete data center interruptions.
  • No protection against DDoS attacks or WAF: Absence of distributed denial of service attack mitigation and web application firewall to detect exploits.
  • Single routing table without traffic control: Simplistic routing configuration without traffic flow segmentation or granular internet access control.
  • Databases and critical services externally accessible: RDS, EFS, and other internal services located in subnets with internet routes, exposing unnecessary attack vectors.

Security Solution Requirements

The transformation required a complete architectural redesign that established:

  • Multilevel VPC architecture with strict separation between public subnets (DMZ) and private subnets (internal services).
  • Multi-AZ deployment distributing critical resources across 3 availability zones to eliminate single points of failure.
  • Redundant NAT Gateway to allow outbound traffic from private subnets without exposing public IPs.
  • Layered Security Groups implementing least privilege principle with specific rules per service role.
  • Additional Network ACLs as second firewall layer at subnet level.
  • AWS Shield Advanced for protection against volumetric, state-exhaustion, and application-layer DDoS attacks.
  • AWS WAF with custom rules for detection and blocking of common attack patterns (SQL injection, XSS, etc.).
  • Automated AWS Backup with retention and disaster recovery policies.
  • Secured bastion host as single administrative entry point.
  • Route 53 private DNS for internal name resolution without external exposure.

Implemented Multilevel Security Architecture

The implemented solution establishes a defense-in-depth model with multiple overlapping security layers (solutions) protecting the infrastructure from the perimeter to the most critical internal services.

Main Security Components

| Component | Technology | Security layer | Purpose |
| --- | --- | --- | --- |
| AWS Shield | DDoS Protection | External perimeter | Automatic protection against volumetric network and transport attacks |
| AWS WAF | Web Application Firewall | Application perimeter | HTTP/HTTPS traffic filtering with custom anti-exploit rules |
| Multilevel VPC | Public/private subnets | Network segmentation | Isolation of front-end and back-end services in differentiated subnets |
| NAT Gateway | Network Address Translation | Outbound traffic control | Controlled outbound access from private subnets without public IPs |
| Internet Gateway | VPC Gateway | Controlled entry | Bidirectional communication only for resources in public subnets |
| Security Groups | Stateful EC2 firewall | Instance control | Specific inbound/outbound traffic rules per service role |
| Network ACLs | Stateless subnet firewall | Subnet control | Additional deny/allow rules at complete subnet level |
| Route Tables | Routing tables | Flow control | Specific routes determining accessibility of each subnet |
| Bastion Host | Hardened EC2 + 2FA | Administrative access | Single SSH entry point with multifactor authentication |
| AWS Backup | Backup service | Disaster recovery | Automated EC2, RDS, EFS snapshots with configurable retention |

Multilevel VPC Architecture Diagram

Multilevel VPC network architecture with public and private subnet separation across multiple availability zones

Solution 1: VPC Restructuring with Multilevel Architecture

Public and Private Subnet Design

The fundamental redesign consisted of creating a completely new VPC with strict segmentation between public subnets (accessible from the internet) and private subnets (completely isolated), distributed across 3 availability zones to ensure high availability.

Implemented IP addressing scheme:

| Subnet name | CIDR | Availability Zone | Type | Purpose |
| --- | --- | --- | --- | --- |
| public-1a | 10.0.11.0/24 | us-east-1a / eu-west-1a | Public | Load balancers, bastion host, NAT Gateway |
| public-1b | 10.0.12.0/24 | us-east-1b / eu-west-1b | Public | Redundant load balancers, NAT Gateway |
| public-1c | 10.0.13.0/24 | us-east-1c / eu-west-1c | Public | Redundant load balancers, NAT Gateway |
| private-1a | 10.0.16.0/24 | us-east-1a / eu-west-1a | Private | Application servers, internal services |
| private-1b | 10.0.17.0/24 | us-east-1b / eu-west-1b | Private | Redundant application servers |
| private-1c | 10.0.18.0/24 | us-east-1c / eu-west-1c | Private | Redundant application servers |

Note: The gap between 10.0.13.0/24 and 10.0.16.0/24 reserves IP ranges for future additional availability zones that AWS may enable.
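An addressing plan like this is easy to sanity-check before any subnet is created. The following is a minimal sketch using Python's standard `ipaddress` module; it assumes the VPC CIDR is 10.0.0.0/16 (the local route range mentioned later in this article), and the function names are illustrative only.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")  # assumed VPC range

# Subnet plan copied from the addressing scheme (name -> CIDR).
SUBNETS = {
    "public-1a": "10.0.11.0/24",
    "public-1b": "10.0.12.0/24",
    "public-1c": "10.0.13.0/24",
    "private-1a": "10.0.16.0/24",
    "private-1b": "10.0.17.0/24",
    "private-1c": "10.0.18.0/24",
}


def validate_plan(vpc, subnets):
    """Parse the plan and fail if a subnet is outside the VPC or overlaps another."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    for name, net in nets.items():
        if not net.subnet_of(vpc):
            raise ValueError(f"{name} ({net}) is outside {vpc}")
    names = sorted(nets)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if nets[a].overlaps(nets[b]):
                raise ValueError(f"{a} overlaps {b}")
    return nets


def reserved_gap(vpc, nets):
    """List the /24 blocks between the last public and the first private subnet."""
    last_public = max(n for name, n in nets.items() if name.startswith("public"))
    first_private = min(n for name, n in nets.items() if name.startswith("private"))
    return [str(b) for b in vpc.subnets(new_prefix=24)
            if last_public < b < first_private]


nets = validate_plan(VPC_CIDR, SUBNETS)
print(reserved_gap(VPC_CIDR, nets))  # ['10.0.14.0/24', '10.0.15.0/24']
```

The two blocks it reports are the ranges the note above describes as reserved for future availability zones.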

Public Subnets: Controlled Exposure Zone

Public subnets are network segments specifically designed to host resources that must be accessible from the internet under strict control. These resources act as a protected entry layer before reaching critical internal services.

Resources deployed in public subnets:

  • Application Load Balancers (ALB): Distribute incoming HTTPS traffic to backend servers in private subnets. Act as reverse proxy with SSL/TLS termination.
  • Bastion Host: Single administrative entry point via SSH with 2FA authentication. Enables controlled access to internal servers through encrypted tunnels.
  • NAT Gateway: Enables outbound traffic from private subnets to internet (updates, external APIs) while blocking unsolicited inbound connections.
  • Elastic IP addresses: Static public IP addresses assigned to resources in public subnets (NAT Gateway, bastion host).

Internet Gateway: Bidirectional Entry Point

The Internet Gateway (IGW) is the fundamental component that enables bidirectional communication between the VPC and the public internet. It is an AWS-managed service, highly available and horizontally scalable without manual intervention.

Technical characteristics of Internet Gateway:

  • Automatic 1:1 NAT translation: Automatically converts private IP addresses of instances in public subnets to their associated public Elastic IPs.
  • Unlimited scalability: AWS manages all necessary capacity and redundancy without configurable bandwidth limits.
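  • No filtering of its own: The gateway only performs address translation; stateful filtering, including automatically allowing responses to established connections, is handled by Security Groups.
  • No additional cost: No hourly or traffic processing charges for the gateway itself; standard data transfer rates still apply.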

Inbound traffic flow through Internet Gateway:

Internet traffic is first filtered by AWS Shield at the AWS edge and then enters the VPC through the Internet Gateway, where the public subnet's route table delivers it to the Application Load Balancer. AWS WAF, attached to the load balancer, inspects each HTTP/HTTPS request before the ALB distributes it to Target Groups, which forward requests to EC2 instances in private subnets.

Routing table for public subnets:

Public subnets have a default route (0.0.0.0/0) pointing to the Internet Gateway, enabling direct internet communication. All intra-VPC traffic uses the local route, while internet-bound traffic is directed through the Internet Gateway.

Internet Gateway configuration and routing tables for public subnets

Private Subnets: Complete Isolation from Public Internet

Private subnets represent the protected core of the infrastructure where critical services reside completely isolated from direct internet access. No instance in private subnets has a public IP assigned.

Resources deployed in private subnets:

  • Application servers: EC2 instances running web applications, REST APIs, backend services.
  • RDS databases: PostgreSQL, MySQL, Aurora with Multi-AZ mode for high availability.
  • EFS volumes: Shared NFS storage for static files, user uploads, shared assets.
  • Internal services: Redis, Memcached, RabbitMQ, queue systems, asynchronous processing workers.

Routing table for private subnets:

Private subnets have NO direct route to Internet Gateway, guaranteeing complete isolation. Their default route points to a NAT Gateway (see Solution 2) that allows only outbound traffic initiated internally. All intra-VPC traffic uses the local route.
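The difference between the public and private route tables can be illustrated with a small longest-prefix-match lookup. This is a simplified sketch, not AWS's implementation; the target labels (`local`, `internet-gateway`, `nat-gateway`) and the sample addresses are placeholders.

```python
import ipaddress

# Route tables as (destination CIDR, target) pairs.
# Target names are illustrative labels, not real AWS resource IDs.
PUBLIC_ROUTES = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "internet-gateway")]
PRIVATE_ROUTES = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "nat-gateway")]


def next_hop(routes, destination):
    """Return the target of the most specific route matching the destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no route: the packet is dropped
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(next_hop(PUBLIC_ROUTES, "10.0.16.25"))     # local
print(next_hop(PUBLIC_ROUTES, "203.0.113.10"))   # internet-gateway
print(next_hop(PRIVATE_ROUTES, "203.0.113.10"))  # nat-gateway
```

The only difference between the two tables is the target of the default route, which is what makes a subnet "public" or "private" in practice.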

Fundamental security implications:

No inbound access from internet: Private subnets have no route to the Internet Gateway, so internal services cannot be reached directly from the internet.
Controlled access from public subnets: Load Balancers in public subnets forward HTTPS traffic to backend servers in private subnets, restricted by strict Security Groups.
Intra-VPC communication without internet: All subnets (public and private) communicate directly through local routing (10.0.0.0/16) without traversing the public internet.
Controlled outbound traffic: Servers in private subnets can initiate outbound connections (updates, APIs) through the NAT Gateway, which blocks unsolicited inbound connections.

Solution 2: NAT Gateway for Outbound Traffic Control

NAT Gateway in multi-AZ architecture showing outbound traffic flow from private subnets

Multi-AZ Redundant NAT Gateway Architecture

NAT Gateways allow servers and services located in private subnets without public IP to initiate outbound connections to the internet (security updates, external APIs, third-party services) while remaining completely inaccessible from the outside.

NAT Gateway security features:

  • Outbound traffic only: NAT Gateway allows connections initiated from inside but blocks unsolicited inbound connections from internet.
  • Internal IP obfuscation: Internal servers have no assigned public IP. NAT Gateway translates their private IPs to its elastic public IP.
  • Automatic high availability: NAT Gateway deployed independently in each AZ to avoid costly cross-AZ traffic and single points of failure.
  • Managed automatic scalability: AWS automatically manages NAT Gateway scalability up to 45 Gbps without manual configuration.

Traffic Flow Through NAT Gateway

When a server in private subnet needs to access an external service:

[Diagram: outbound traffic flow from a private subnet through the NAT Gateway to an external service]
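The outbound-only behavior can be sketched with a toy translation table: an outbound connection creates a mapping, and inbound packets are accepted only if they match one. This is a conceptual model under simplified assumptions (the class name, port allocation, and sample addresses are invented for illustration); the real NAT Gateway is a managed AWS service whose internals are not exposed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNat:
    """Toy model of source NAT with connection tracking."""
    public_ip: str
    next_port: int = 1024
    # public port -> (private ip, private port, remote ip, remote port)
    table: dict = field(default_factory=dict)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source to the NAT's public IP and remember the mapping."""
        port = self.next_port
        self.next_port += 1
        self.table[port] = (src_ip, src_port, dst_ip, dst_port)
        return (self.public_ip, port, dst_ip, dst_port)

    def inbound(self, remote_ip, remote_port, public_port):
        """Return the private endpoint for a reply, or None if unsolicited."""
        entry = self.table.get(public_port)
        if entry is None:
            return None
        src_ip, src_port, dst_ip, dst_port = entry
        if (remote_ip, remote_port) != (dst_ip, dst_port):
            return None
        return (src_ip, src_port)


nat = ToyNat("198.51.100.7")
translated = nat.outbound("10.0.16.25", 40000, "203.0.113.10", 443)
print(translated)                              # ('198.51.100.7', 1024, '203.0.113.10', 443)
print(nat.inbound("203.0.113.10", 443, 1024))  # ('10.0.16.25', 40000)
print(nat.inbound("192.0.2.99", 443, 1024))    # None (unsolicited sender)
```

The external service only ever sees the NAT's public address, and a packet from any host that was not contacted first has no mapping to follow.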

Security benefits:

  1. Minimized attack surface: Internal servers are completely invisible from internet (no public IP).
  2. Impossibility of direct inbound attacks: An attacker cannot scan or connect to internal services.
  3. Centralized auditing: All outbound traffic passes through NAT Gateways monitored with VPC Flow Logs.
  4. Resilience: NAT Gateway is a managed high-availability service without operational maintenance.

Solution 3: Perimeter Security with AWS Shield and WAF

AWS Shield: Protection Against DDoS Attacks

AWS Shield Standard is automatically enabled on all AWS resources at no additional cost, providing protection against the most common DDoS attacks at network and transport layers (OSI model layers 3 and 4).

Components protected by AWS Shield in the architecture:

AWS Global Accelerator: Static Anycast IPs include integrated Shield protection against volumetric SYN flood, UDP flood, reflection attacks.
Application Load Balancer: Accepts only well-formed HTTP/HTTPS requests and, combined with AWS WAF, mitigates HTTP floods, Slowloris, and other application-layer attacks.
Elastic IP addresses: Public IP addresses assigned to NAT Gateways protected against resource exhaustion attacks.

Automatic protection mechanisms:

  • Inline detection: Real-time traffic analysis to identify anomalous patterns characteristic of DDoS.
  • Automatic mitigation: Immediate activation of countermeasures without manual intervention or waiting time.
  • Malicious traffic absorption: Distributed AWS infrastructure absorbs volumetric attacks before they reach resources.

Related Project

I worked on another series of projects specifically addressing advanced DDoS protection with AWS Shield, global latency optimization, and static Anycast IP addresses with AWS Global Accelerator implemented in high-availability architectures with global load balancing for corporate firewall configurations and enterprise-level performance.

View project: Global load balancing and high availability on AWS

AWS WAF: Web Application Firewall

AWS WAF configuration with custom rules for protection against SQL injection, XSS and rate limiting

AWS WAF (Web Application Firewall) provides an additional security layer by inspecting HTTP/HTTPS traffic and blocking malicious requests according to defined custom rules.

Implemented WAF rules:

| Rule type | Target | Description |
| --- | --- | --- |
| SQL Injection | SQL injection attacks | Detects and blocks SQL code injection attempts in URL parameters, headers, body |
| Cross-Site Scripting | XSS attacks | Prevents injection of malicious JavaScript scripts in form fields |
| Rate limiting | Brute force attacks | Limits requests per IP (e.g., 2000 req/5min) to prevent scans and brute force |
| Geo-blocking | Geographic restriction | Blocks traffic from specific countries with high malicious activity |
| IP reputation lists | Known IP blocking | Integration with AWS and third-party reputation lists to block malicious IPs |
| Bot control | Bot detection | Identifies and controls legitimate and illegitimate automated traffic |
| Custom rules | Specific protection | Custom rules based on observed attack patterns |

AWS WAF integrates directly with Application Load Balancers, inspecting all HTTP/HTTPS traffic before it reaches backend EC2 instances. Custom WAF rules include dynamic whitelist updates via Lambda functions to maintain authorized corporate IPs without manual intervention.
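To make the rate-limiting rule concrete, the idea of "at most 2000 requests per IP in any 5-minute window" can be modeled with a sliding window per client. This is a simplified illustration of the concept, not the algorithm AWS WAF uses internally; the class and parameter names are hypothetical.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per IP within `window_seconds`."""

    def __init__(self, limit=2000, window_seconds=300):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, now):
        """Return True if the request is accepted, False if it is blocked."""
        recent = self.hits[ip]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


# Small limit so the behavior is easy to follow.
limiter = SlidingWindowLimiter(limit=3, window_seconds=300)
results = [limiter.allow("203.0.113.10", t) for t in (0, 1, 2, 3)]
print(results)                               # [True, True, True, False]
print(limiter.allow("203.0.113.10", 300))    # True (the request at t=0 expired)
print(limiter.allow("198.51.100.20", 3))     # True (limits are tracked per IP)
```

Tuning the limit against real traffic matters: a threshold that is too low blocks legitimate clients behind shared corporate or carrier NAT addresses.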

Solution 4: Layered Security Groups and Network ACLs

Security Groups: Stateful Firewall at Instance Level

Security Groups function as stateful virtual firewalls controlling inbound and outbound traffic for individual EC2 instances. Being stateful, they automatically allow response traffic to established connections.

Security Groups strategy by layers:

| Layer | Security Group | Inbound rules | Outbound rules |
| --- | --- | --- | --- |
| Load Balancer | sg-alb-public | 0.0.0.0/0:443 (HTTPS); 0.0.0.0/0:80 (HTTP) | sg-web-servers:8000 (App) |
| Web Servers | sg-web-servers | sg-alb-public:8000; sg-bastion:22 (SSH) | 0.0.0.0/0:443 (HTTPS outbound); sg-database:3306 (MySQL) |
| Database | sg-database | sg-web-servers:3306; sg-bastion:3306 (tunnel) | None |
| Bastion Host | sg-bastion | Corporate IPs:12119 (SSH) | sg-web-servers:22; sg-database:22 |
| NAT Gateway | sg-nat | sg-web-servers:any | 0.0.0.0/0:any |

Applied design principles:

  1. Least privilege: Only strictly necessary ports are opened for the specific service function.
  2. Security Group reference: Instead of specifying IPs, other Security Groups are referenced (e.g., sg-web-servers can connect to sg-database).
  3. Separation of responsibilities: Each application layer has its own Security Group with specific rules.
  4. No “allow all” rules: 0.0.0.0/0 is never used for traffic between internal layers.
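The principle of referencing Security Groups instead of IP addresses can be sketched as a small lookup: a flow is allowed only if the target group lists the source group (or CIDR) and port. This is an illustrative model of the inbound rules listed above, not an AWS API call; the `corporate-ips` label stands in for the real allowlisted ranges, which are not given here.

```python
# Inbound rules per Security Group as (source, port) pairs.
# A source is either another Security Group name or a CIDR/label.
INBOUND = {
    "sg-alb-public": {("0.0.0.0/0", 443), ("0.0.0.0/0", 80)},
    "sg-web-servers": {("sg-alb-public", 8000), ("sg-bastion", 22)},
    "sg-database": {("sg-web-servers", 3306), ("sg-bastion", 3306)},
    "sg-bastion": {("corporate-ips", 12119)},  # placeholder for allowlisted IPs
}

INTERNAL_GROUPS = {"sg-web-servers", "sg-database"}


def is_allowed(source, target_group, port):
    """True if the target group has an inbound rule for this source and port."""
    return (source, port) in INBOUND.get(target_group, set())


def open_to_world(inbound, groups):
    """Return the groups among `groups` that accept traffic from 0.0.0.0/0."""
    return sorted(
        group for group in groups
        if any(source == "0.0.0.0/0" for source, _ in inbound.get(group, ()))
    )


print(is_allowed("sg-web-servers", "sg-database", 3306))  # True
print(is_allowed("sg-alb-public", "sg-database", 3306))   # False
print(open_to_world(INBOUND, INTERNAL_GROUPS))            # []
```

A check like `open_to_world` is the kind of assertion that can run in a pipeline to catch an internal layer accidentally opened to 0.0.0.0/0.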

Network ACLs: Stateless Firewall at Subnet Level

Network ACLs (Access Control Lists) operate at the subnet level as an additional security layer. Unlike Security Groups, they are stateless: they don’t track connection state, so return traffic must be explicitly allowed, and they evaluate numbered rules in ascending order, applying the first match.

Network ACL for private subnets:

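| Rule # | Type | Protocol | Port | Source/Destination | Action | Purpose |
| --- | --- | --- | --- | --- | --- | --- |
| 100 | Inbound | TCP | 8000 | 10.0.11.0/24 (public) | ALLOW | Traffic from ALB |
| 110 | Inbound | TCP | 22 | 10.0.11.0/24 (public) | ALLOW | SSH from bastion |
| 120 | Inbound | TCP | 1024-65535 | 0.0.0.0/0 | ALLOW | Ephemeral port responses |
| * | Inbound | ALL | ALL | 0.0.0.0/0 | DENY | Deny rest |
| 100 | Outbound | TCP | 3306 | 10.0.0.0/16 | ALLOW | MySQL to RDS |
| 110 | Outbound | TCP | 443 | 0.0.0.0/0 | ALLOW | HTTPS outbound |
| 120 | Outbound | TCP | 1024-65535 | 0.0.0.0/0 | ALLOW | Ephemeral port responses |
| * | Outbound | ALL | ALL | 0.0.0.0/0 | DENY | Deny rest |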
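Network ACL evaluation is a first-match scan over numbered rules with a final implicit deny. The sketch below models the inbound rules for private subnets; it is a simplified illustration (TCP only, a single port per packet), and the `Rule` type and `evaluate` function are hypothetical names.

```python
import ipaddress
from typing import NamedTuple


class Rule(NamedTuple):
    number: int
    protocol: str
    ports: range      # inclusive port range expressed as a Python range
    cidr: str
    action: str


# Inbound rules for private subnets, taken from the ACL listed above.
INBOUND_PRIVATE = [
    Rule(100, "tcp", range(8000, 8001), "10.0.11.0/24", "ALLOW"),
    Rule(110, "tcp", range(22, 23), "10.0.11.0/24", "ALLOW"),
    Rule(120, "tcp", range(1024, 65536), "0.0.0.0/0", "ALLOW"),
]


def evaluate(rules, protocol, port, source_ip):
    """Apply the lowest-numbered matching rule; deny if nothing matches."""
    source = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if (rule.protocol == protocol
                and port in rule.ports
                and source in ipaddress.ip_network(rule.cidr)):
            return rule.action
    return "DENY"  # the implicit "*" rule


print(evaluate(INBOUND_PRIVATE, "tcp", 8000, "10.0.11.15"))     # ALLOW (rule 100)
print(evaluate(INBOUND_PRIVATE, "tcp", 22, "203.0.113.10"))     # DENY
print(evaluate(INBOUND_PRIVATE, "tcp", 50000, "203.0.113.10"))  # ALLOW (rule 120)
```

Because ACLs are stateless, the ephemeral-port rule is what lets replies to outbound connections back in; without it, responses would hit the final deny.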

The combination of Security Groups (stateful, instance level) and Network ACLs (stateless, subnet level) provides defense in depth with two independent layers of traffic filtering.

Layered defense advantages:

  • If a Security Group rule is misconfigured or too permissive, the Network ACL provides a second line of defense.
  • Network ACLs allow explicit blocks (DENY rules) that Security Groups don’t support.
  • VPC Flow Logs record both accepted and rejected traffic, providing detailed data for forensic analysis.

Solution 5: Multi-AZ Deployment for Resilience

Critical Resource Distribution Across Availability Zones

Multi-AZ (multi-availability zone) deployment distributes critical resources across 3 physically separate availability zones (each made up of one or more data centers) within the same AWS region, eliminating single points of failure at infrastructure level.

Multi-AZ distributed components:

  • Availability Zone 1a: Public + private subnet, primary NAT Gateway, bastion host, application servers.
  • Availability Zone 1b: Public + private subnet, redundant NAT Gateway, application servers, RDS standby replica.
  • Availability Zone 1c: Public + private subnet, redundant NAT Gateway, application servers, available for scaling.

Failure scenarios and automatic recovery:

| Failure scenario | Impact without multi-AZ | Impact with multi-AZ | Recovery time |
| --- | --- | --- | --- |
| EC2 server failure | Complete service outage | Auto Scaling launches new server in healthy AZ | 2-5 minutes |
| Availability zone failure | Total system outage | ALB redirects traffic to healthy AZs | ~30 seconds |
| NAT Gateway failure | Loss of outbound connectivity | Traffic redirected to NAT in another AZ automatically | Instantaneous |
| RDS database failure | Total data loss until restoration | Automatic failover to standby replica in another AZ | 60-120 seconds |

Auto Scaling Groups for Self-Healing

Auto Scaling Groups continuously monitor EC2 instance health through Application Load Balancer health checks. When an instance fails or becomes unhealthy, the group automatically terminates it and launches a replacement in a healthy availability zone.

Multi-AZ Auto Scaling Group configuration:
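
A minimal sketch of what such a configuration could look like, expressed as the keyword arguments accepted by boto3's `create_auto_scaling_group`. The group name, subnet IDs, target group ARN, launch template name, and capacity numbers are placeholders chosen for illustration, not values from these projects.

```python
def build_asg_params(name, subnet_ids, target_group_arn, launch_template_name,
                     min_size=3, max_size=9, desired=3):
    """Build keyword arguments for boto3's create_auto_scaling_group call."""
    if len(set(subnet_ids)) < 2:
        raise ValueError("use private subnets in at least two availability zones")
    return {
        "AutoScalingGroupName": name,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # One private subnet per AZ; the API expects a comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        # Rely on load balancer health checks, not only EC2 status checks.
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": 300,
        "TargetGroupARNs": [target_group_arn],
        # The launch template points at the pre-built AMI.
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
    }


params = build_asg_params(
    name="web-asg-example",                                  # hypothetical
    subnet_ids=["subnet-private-1a", "subnet-private-1b", "subnet-private-1c"],
    target_group_arn="arn:aws:elasticloadbalancing:example",  # placeholder
    launch_template_name="web-golden-image",                 # hypothetical
)
print(params["VPCZoneIdentifier"])
```

With credentials configured, the dictionary would be passed as `boto3.client("autoscaling").create_auto_scaling_group(**params)`; building it separately keeps the settings reviewable and testable without touching AWS.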

Replacement strategies:

  • Balanced distribution: Auto Scaling maintains similar number of instances in each AZ.
  • Different AZ replacement: If a complete AZ fails, instances are replaced in healthy AZs.
  • Golden Image AMI: New instances launch from pre-configured AMI ensuring consistency.

Related Project

I worked on another series of projects where I specifically address automatic auto-scaling strategies, distributed instance groups, and zero-downtime deployment patterns implemented in high-availability architectures with global load balancing and AWS Global Accelerator.

View project: Auto-scaling and high availability strategies

Solution 6: Bastion Host and Controlled Administrative Access

Single Entry Point with Multifactor Authentication

The bastion host (also known as jump box) is a specially hardened EC2 instance located in a public subnet that functions as the single SSH entry point to the internal infrastructure. This design eliminates the need to assign public IPs to internal servers.

Bastion host security features:

  • Two-factor authentication (2FA): Combination of RSA key + mandatory Google Authenticator TOTP.
  • Role-based access control: User groups with specific permissions (admin, developer, external-ro, external-rw, sftp-only).
  • Complete session auditing: Recording and playback of all administrative activity through sudo logging and sudoreplay.
  • Encrypted SSH tunnels: Secure access to internal services (RDS, SFTP, Windows RDP) without exposing ports.
  • Restrictive Security Group: Only SSH port (non-standard) accessible from whitelisted corporate IPs.
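
For readers unfamiliar with the second factor, the codes shown by Google Authenticator follow the TOTP algorithm (RFC 6238): an HMAC over the current 30-second time step, truncated to a few digits. The sketch below shows the computation with the standard library only; it illustrates how the codes are derived and is not the bastion's actual PAM configuration.

```python
import base64
import hashlib
import hmac
import struct


def totp(secret_b32, unix_time, step=30, digits=6):
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given time."""
    key = base64.b32decode(secret_b32, casefold=True)
    # The moving factor is the number of time steps since the Unix epoch.
    counter = struct.pack(">Q", int(unix_time) // step)
    digest = hmac.new(key, counter, hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects 4 bytes.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Shared secret from the RFC 6238 test vectors, Base32-encoded as
# authenticator apps expect it.
secret = base64.b32encode(b"12345678901234567890").decode()
print(totp(secret, 59, digits=8))  # 94287082 (RFC 6238 test vector)
print(totp(secret, 59))            # 287082
```

Because the code depends on both the shared secret and the current time, a stolen SSH key alone is not enough to log in.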

Administrative access flow:

[Diagram: administrative access flow through the bastion host]

Related Project

I worked on a series of projects specifically dedicated to the advanced implementation of bastion host with mandatory 2FA authentication, role-based access control using Linux groups, encrypted SSH tunnels for internal services, and complete session audit system with sudoreplay for regulatory compliance and full traceability of administrative access.

View project: AWS infrastructure security with advanced bastion host and 2FA

Solution 7: AWS Backup for Disaster Recovery

AWS Backup configuration showing retention policies and cross-region replication

Multi-Service Automated Backup Strategy

AWS Backup provides a centralized and automated service for managing backups of multiple AWS services with retention, encryption, and cross-region recovery policies.

Backed-up resources in the architecture:

| Resource type | Backup frequency | Retention | Secondary destination |
| --- | --- | --- | --- |
| EC2 Instances | Daily at 02:00 UTC | 7 daily; 4 weekly; 6 monthly | Cross-region to secondary region |
| RDS Aurora | Continuous (Point-in-Time) + Daily snapshot | 7 automatic days; 35 days Point-in-Time | Automatic cross-region replica |
| EFS Volumes | Daily at 03:00 UTC | 7 daily; 4 weekly | Replication to secondary region |
| DynamoDB Tables | Continuous (Point-in-Time) | 35 days PITR | Optional cross-region replica |

Automated backup policy:

The implemented backup strategy includes daily, weekly, and monthly backups with automated lifecycle management. Daily backups are retained for 7 days, with automatic cross-region replication to Frankfurt. Weekly backups maintain 28-day retention, while monthly backups are kept for 180 days with cold storage transition after 90 days.
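
The retention windows translate directly into expiry dates for each recovery point. The following sketch computes them from the 7, 28, and 180-day retention periods stated above; the function names are illustrative and this is not AWS Backup's API.

```python
from datetime import date, timedelta

# Retention periods in days for each backup rule.
RETENTION_DAYS = {"daily": 7, "weekly": 28, "monthly": 180}


def expires_on(created, rule):
    """Date on which a recovery point created by `rule` is deleted."""
    return created + timedelta(days=RETENTION_DAYS[rule])


def is_restorable(created, rule, today):
    """True while the recovery point still exists on `today`."""
    return created <= today < expires_on(created, rule)


created = date(2024, 1, 1)
print(expires_on(created, "daily"))                        # 2024-01-08
print(expires_on(created, "weekly"))                       # 2024-01-29
print(is_restorable(created, "daily", date(2024, 1, 10)))  # False
print(is_restorable(created, "monthly", date(2024, 1, 10)))  # True
```

A calculation like this is useful when checking that the combined daily, weekly, and monthly schedule actually covers the recovery window a compliance requirement asks for.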

Disaster recovery process:

  1. EC2 recovery: AMIs generated by AWS Backup launch as new instances with identical configuration.
  2. RDS recovery: Restore from snapshot or Point-in-Time Recovery with per-second granularity.
  3. EFS recovery: Direct mounting of backup point or creation of new filesystem from snapshot.
  4. Cross-region recovery: In case of complete regional disaster, restoration from secondary region (Frankfurt).

Advantages of AWS Backup vs manual snapshots:

Complete automation: Elimination of human errors from forgetting scheduled manual backups.
Centralized management: Single dashboard for EC2, RDS, EFS, DynamoDB backups instead of separate consoles.
Regulatory compliance: Configurable retention policies that meet compliance requirements (GDPR, SOC 2, etc.).
Automatic cross-region: Replication to secondary region without additional scripts for geographic DR.

Solution 8: VPC Peering for Cross-Region Disaster Recovery

VPC Peering configuration between regions showing routing tables and traffic flow

Secure Connectivity Between AWS Regions

VPC Peering establishes private network connections between VPCs located in different AWS regions, creating a direct communication path that doesn’t traverse the public internet. This architecture was fundamental for implementing cross-region Disaster Recovery (DR) strategies for several clients with high availability and business continuity requirements.

Typical implementation scenario:

| Primary region | Secondary region (DR) | Peering purpose |
| --- | --- | --- |
| eu-west-1 (Ireland) | eu-central-1 (Frankfurt) | Data replication, automatic failover, shared resource access |
| us-east-1 (Virginia) | us-west-2 (Oregon) | Cross-region backup, standby services, data synchronization |

VPC Peering Security Features

  • Encrypted private traffic: Direct communication over AWS private network without public internet exposure, with encryption in transit.
  • Granular routing control: Specific routing tables determine exactly which subnets can communicate between regions.
  • Low cross-region latency: Typical Ireland-Frankfurt latency ~25ms, significantly better than VPN over public internet.
  • No single points of failure: AWS-managed connection with automatic redundancy without need for manual high availability configuration.
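
One practical constraint behind the routing control is that peered VPCs must not have overlapping CIDR ranges, and each side needs a route to the other's range via the peering connection. The sketch below checks both; the DR VPC range 10.1.0.0/16 and the `pcx-example` identifier are assumptions made for illustration, since the source gives only the primary range.

```python
import ipaddress


def peering_routes(local_cidr, peer_cidr, peering_id):
    """Return the route each side needs, or fail if the CIDRs overlap."""
    local = ipaddress.ip_network(local_cidr)
    peer = ipaddress.ip_network(peer_cidr)
    if local.overlaps(peer):
        raise ValueError(f"{local} overlaps {peer}; peering routes would be ambiguous")
    return {
        # Route added to the primary VPC's tables: reach the peer via peering.
        "primary_side": (str(peer), peering_id),
        # Route added to the DR VPC's tables: reach the primary via peering.
        "dr_side": (str(local), peering_id),
    }


routes = peering_routes("10.0.0.0/16", "10.1.0.0/16", "pcx-example")
print(routes["primary_side"])  # ('10.1.0.0/16', 'pcx-example')
print(routes["dr_side"])       # ('10.0.0.0/16', 'pcx-example')
```

Narrowing the destination from the whole peer VPC to individual subnet CIDRs is how access can be limited to, for example, only the database subnets in the DR region.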

Implemented VPC Peering Use Cases

Cross-region database replication: RDS Aurora with read replicas in Frankfurt enabling automatic failover in case of complete regional disaster in Ireland.
Cross-region EFS backup: EFS volume replication from Ireland to Frankfurt via AWS DataSync over VPC Peering for critical file recovery.
AMI synchronization: Automatic copy of production AMIs to DR region for rapid replacement instance launch in case of regional failure.
Cross-region administrative access: Bastion host in Ireland can access Frankfurt servers via VPC Peering without exposing additional public IPs.

Solution 9: Secure Hybrid Connectivity with On-Premise Infrastructure

Hybrid connectivity configuration showing Site-to-Site VPN and Direct Connect

Corporate Environment Integration with AWS

Several clients operated existing on-premise infrastructure that required secure and reliable connectivity with resources deployed on AWS. We implemented hybrid connectivity solutions using different technologies according to each client’s specific requirements: latency, bandwidth, regulatory compliance, and budget.

Implemented connectivity solutions:

| Technology | Use cases | Latency | Bandwidth | Cost |
| --- | --- | --- | --- | --- |
| AWS Site-to-Site VPN | Corporate offices, administrative access, non-critical traffic | ~50-100ms (over internet) | Up to 1.25 Gbps | Low |
| AWS Direct Connect | Massive data replication, latency-sensitive apps, compliance | ~10-20ms (dedicated) | 1-100 Gbps | High |
| VPN over Direct Connect | Maximum security for regulated data, end-to-end encryption | ~15-25ms | 1-10 Gbps | Medium-High |

For clients with existing on-premise infrastructure, I implemented secure connectivity between corporate data centers and AWS using two approaches based on their specific requirements:

AWS Site-to-Site VPN was deployed for clients requiring encrypted connectivity over the internet. This solution established IPsec tunnels between the Customer Gateway (client’s on-premise router) and the Virtual Private Gateway in AWS, providing redundant tunnels and BGP routing for automatic failover.

AWS Direct Connect was implemented for clients with high bandwidth requirements (1-10 Gbps) or regulatory compliance mandating that data not transit public internet. This dedicated fiber connection provided consistent low latency and predictable performance for database replication and latency-sensitive applications.

For clients requiring both bandwidth and end-to-end encryption, I configured hybrid architectures combining Direct Connect as the high-speed primary connection with Site-to-Site VPN layered on top for encryption, ensuring compliance with GDPR and HIPAA requirements.

Results and Business Impact

The implemented architecture transformed the security posture and operational resilience of these infrastructures:

Security improvements: Complete elimination of public IPs on internal servers and databases drastically reduced the attack surface. AWS Shield and WAF provided automatic protection against DDoS attacks and common web exploits (SQL injection, XSS). The multilevel network segmentation established physical and logical separation between front-end and back-end services.

High availability: Multi-AZ deployment eliminated single points of failure with automatic failover capabilities. Auto Scaling Groups provided self-healing for EC2 instances, while RDS Aurora maintained automatic database failover to standby replicas. Cross-region backup replication ensured recovery capabilities even in complete regional disasters.

Compliance: The architecture aligned with GDPR, SOC 2, ISO 27001, and PCI DSS requirements through documented technical controls. VPC Flow Logs enabled complete network traffic capture for forensic analysis, while the bastion host with 2FA provided traceable administrative access.

Key Technical Achievements

Design and implementation of multilevel VPC architecture with strict separation of public/private subnets distributed across 3 AZs.
Configuration of multi-AZ redundant NAT Gateway with subnet-specific routing tables for granular traffic control.
Implementation of AWS Shield and WAF with custom anti-exploit rules (SQL injection, XSS, rate limiting).
Design of layered Security Groups and Network ACLs implementing defense in depth with least privilege principle.
Configuration of automated AWS Backup with retention policies and cross-region replication for DR.
Deployment of hardened bastion host with 2FA as single administrative entry point (detailed in another series of projects from my portfolio).
Implementation of cross-region VPC Peering (Ireland-Frankfurt) for Disaster Recovery with data replication and automatic failover.
Configuration of hybrid connectivity via Site-to-Site VPN and AWS Direct Connect for secure integration with on-premise infrastructure.

Lessons Learned

What worked exceptionally well:

  1. Physical public/private subnet separation: Strict network segmentation completely eliminated direct attack vectors to critical internal services.
  2. NAT Gateway per AZ: Deploying independent NAT Gateway in each availability zone avoided costly cross-AZ traffic and single points of failure.
  3. Centralized AWS Backup: Unified backup management for EC2, RDS, and EFS simplified operations and ensured consistency in retention policies.
  4. Security Groups by reference: Using Security Group references instead of specific IPs facilitated scaling and maintenance without constantly modifying rules.
  5. VPC Flow Logs from day one: Enabling network flow logging from the start provided complete visibility for troubleshooting and security analysis.

Technical challenges overcome:

  1. Zero-downtime migration: Required a cutover strategy with reduced DNS TTL and data synchronization to migrate services from the old architecture to the new VPC without interruptions.
  2. Routing table complexity: Required exhaustive documentation and traffic flow diagrams to avoid configuration errors causing connectivity loss.
  3. NAT Gateway costs: NAT Gateway is billed per hour and per GB processed. I implemented traffic analysis to optimize usage and avoid billing surprises.
  4. Initial WAF false positives: WAF rules initially blocked some legitimate requests, requiring fine-tuning based on real traffic patterns.
  5. Corporate IP allowlist management: Clients with dynamic IPs for bastion access required a Lambda-based solution that updates Security Groups dynamically.
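
The NAT Gateway cost point lends itself to a quick estimate, since the bill has an hourly component per gateway and a per-GB component for processed traffic. In the sketch below the rates are placeholder values passed in by the caller, not actual AWS prices, which vary by region and change over time.

```python
def nat_monthly_cost(gateways, gb_processed, hourly_rate, per_gb_rate, hours=730):
    """Estimate monthly NAT Gateway cost from hourly and per-GB charges."""
    fixed = gateways * hours * hourly_rate  # one hourly charge per gateway
    variable = gb_processed * per_gb_rate   # data processing charge
    return round(fixed + variable, 2)


# Three gateways (one per AZ), 1000 GB processed, placeholder rates.
estimate = nat_monthly_cost(gateways=3, gb_processed=1000,
                            hourly_rate=0.05, per_gb_rate=0.05)
print(estimate)  # 159.5
```

Running the same estimate with one gateway instead of three shows the trade-off directly: the fixed cost drops, but cross-AZ traffic charges and a single point of failure come back.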

Conclusion

This AWS network security transformation project represents a complete case study on how to redesign vulnerable and monolithic architectures into resilient, secure, and scalable enterprise infrastructures. By implementing a multilevel VPC with strict separation between public and private subnets, redundant NAT Gateway for outbound traffic control, multi-AZ deployment for high availability, AWS Shield and WAF for perimeter protection, and AWS Backup for disaster recovery, these organizations completely transformed their security posture.

The resulting architecture not only eliminates single points of failure and drastically reduces the attack surface but also establishes a defense-in-depth model with multiple overlapping security layers protecting critical assets from the perimeter to the most sensitive internal services. This architectural approach is applicable to any organization operating infrastructure on AWS and requiring security, availability, and regulatory compliance guarantees.

Key Conclusions for Similar Projects

  1. Network segmentation is fundamental: Separation between public and private subnets should be the first step in any production AWS architecture.
  2. NAT Gateway vs NAT Instance: Using managed NAT Gateway instead of self-managed NAT Instances eliminates maintenance operations and ensures availability.
  3. Multi-AZ is not optional: Deploying critical resources in multiple availability zones should be a baseline requirement, not a later optimization.
  4. Layered Security Groups: Implementing service role-specific security groups facilitates maintenance and complies with least privilege principle.
  5. WAF requires fine-tuning: Generic WAF rules can generate false positives; they require calibration based on legitimate traffic patterns specific to each application.
  6. Automated AWS Backup from start: Configuring automated backups from day one prevents data loss from unexpected incidents.
  7. Network flow documentation: Maintaining updated diagrams of traffic flows and routing tables is critical for rapid troubleshooting.

Need to transform your AWS infrastructure with enterprise security?

If your organization faces similar challenges:

  • Servers with public IPs directly exposed to internet without perimeter protection.
  • Single-zone architecture vulnerable to complete service availability failures.
  • No DDoS protection or WAF leaving applications vulnerable to volumetric attacks and exploits.
  • Compliance requirements (GDPR, SOC 2, ISO 27001, PCI DSS) without documented technical controls.
  • Absence of DR strategy with risk of catastrophic data loss.

As an AWS cloud architect with 20+ years of infrastructure security experience, I can help you design and implement enterprise architectures that combine multilevel security, high availability, and regulatory compliance without compromising operability.

Specialized in multilevel VPC, NAT Gateway, AWS Shield/WAF, multi-AZ architectures, and backup strategies for disaster recovery.

About the author

Daniel López Azaña

Tech entrepreneur and cloud architect with over 20 years of experience transforming infrastructures and automating processes.

Specialist in AI/LLM integration, Rust and Python development, and AWS & GCP architecture. Restless mind, idea generator, and passionate about technological innovation and AI.
