Case Studies: Elastic Load Balancing - ALB, NLB, GWLB
Case Study 1
A financial services company is migrating a high-frequency trading application to AWS. The application requires extremely low latency and processes millions of TCP connections per second from trading partners. Each connection carries market data that must be delivered with sub-millisecond latency. The application backend runs on a fleet of EC2 instances in a single Availability Zone for performance reasons, but the company wants to implement load balancing without introducing additional network hops or latency. The trading application uses a proprietary TCP protocol that includes connection state information, and clients expect to reconnect to the same backend server if a connection is temporarily interrupted. The application has no HTTP layer.
Which load balancing solution should the solutions architect recommend to meet these requirements?
- Deploy an Application Load Balancer with connection draining enabled and configure sticky sessions based on application-generated cookies to ensure clients reconnect to the same target.
- Deploy a Network Load Balancer with cross-zone load balancing disabled and enable client IP preservation to maintain the source IP address of trading partners.
- Deploy a Gateway Load Balancer to distribute traffic across the EC2 instances and use GENEVE protocol encapsulation to preserve connection state information.
- Deploy multiple Network Load Balancers, one per Availability Zone, and use Route 53 latency-based routing to distribute traffic across the load balancers.
Answer & Explanation
Correct Answer: 2 - Deploy a Network Load Balancer with cross-zone load balancing disabled and enable client IP preservation
Why this is correct: Network Load Balancer operates at Layer 4 (TCP/UDP) and is specifically designed for ultra-low latency, high-throughput scenarios with millions of requests per second. It preserves the client source IP address, which is critical for trading applications that may implement IP-based authentication or logging. With cross-zone load balancing disabled and all targets in a single AZ, the NLB minimizes latency by avoiding cross-AZ traffic. NLB routes each connection with a flow hash algorithm and also supports source-IP-based target group stickiness, so a client that reconnects after an interruption can be routed to the same target. It introduces minimal latency overhead compared to ALB because it operates at the connection level without deep packet inspection.
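The flow-hash behavior can be sketched in a few lines. This is an illustrative simulation only, not AWS's actual hashing scheme, and the target IDs are hypothetical:

```python
import hashlib

def pick_target(src_ip, src_port, dst_ip, dst_port, protocol, targets):
    """Illustrative 5-tuple flow hash: every packet of the same TCP flow
    produces the same tuple, so it always maps to the same target."""
    key = f"{protocol}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return targets[digest % len(targets)]

targets = ["i-0aaa", "i-0bbb", "i-0ccc"]  # hypothetical instance IDs
flow = ("203.0.113.10", 49152, "10.0.1.5", 9000, "tcp")

first = pick_target(*flow, targets)
# Same flow tuple, same target, every time:
assert all(pick_target(*flow, targets) == first for _ in range(100))
```

Note that because the 5-tuple includes the client's ephemeral source port, a brand-new connection can hash to a different target; it is the source-IP stickiness attribute on the target group that pins a reconnecting client to the same backend.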
Why the other options are wrong:
- Option 1: Application Load Balancer operates at Layer 7 (HTTP/HTTPS) and requires the HTTP protocol. This trading application uses a proprietary TCP protocol with no HTTP layer, making ALB incompatible. ALB also introduces higher latency due to HTTP parsing and routing logic, which violates the sub-millisecond latency requirement.
- Option 3: Gateway Load Balancer is designed for deploying, scaling, and managing third-party virtual network appliances such as firewalls, intrusion detection systems, and deep packet inspection systems. It uses GENEVE encapsulation to send traffic to and from appliances. This is not a load balancing solution for application traffic distribution and would introduce unnecessary encapsulation overhead and latency.
- Option 4: Deploying multiple NLBs across Availability Zones contradicts the stated architecture where backend instances run in a single AZ for performance reasons. Route 53 latency-based routing introduces DNS resolution delays and doesn't address the requirement for low-latency load balancing within the existing single-AZ deployment. This adds unnecessary complexity without meeting the constraints.
Key Insight: The protocol requirement (TCP vs HTTP) immediately eliminates ALB, while understanding GWLB's specific use case (network appliance insertion) versus general load balancing distinguishes candidates who know when NOT to use a service. The single-AZ constraint combined with latency requirements points to NLB with cross-zone disabled.
Case Study 2
A healthcare technology company operates a patient portal that serves 2 million users across North America. The application consists of microservices running in containers on Amazon ECS with Fargate. Different microservices handle authentication, patient records, appointment scheduling, and billing. The company needs to route incoming requests to the appropriate microservice based on the URL path: /auth/* should route to authentication services, /records/* to patient record services, and /billing/* to billing services. The company also needs to implement user session persistence based on cookies, ensure end-to-end encryption with TLS termination at the load balancer, and support WebSocket connections for real-time appointment notifications. Compliance requirements mandate that all HTTP traffic be automatically redirected to HTTPS.
Which load balancing solution provides all required capabilities with the least operational overhead?
- Deploy a Network Load Balancer with TLS listeners and use path-based routing rules to direct traffic to different target groups based on URL paths, then configure cookie-based sticky sessions.
- Deploy an Application Load Balancer with HTTPS listeners, configure path-based routing rules to route requests to different target groups, enable sticky sessions using application-based cookies, and create a redirect rule from HTTP to HTTPS.
- Deploy a Classic Load Balancer with HTTPS listeners and configure multiple listeners for different URL paths, then use X-Forwarded-For headers to maintain session persistence.
- Deploy a Gateway Load Balancer to inspect incoming traffic and route to appropriate services, then deploy separate Network Load Balancers for each microservice to handle TLS termination and session persistence.
Answer & Explanation
Correct Answer: 2 - Deploy an Application Load Balancer with HTTPS listeners and path-based routing
Why this is correct: Application Load Balancer operates at Layer 7 and natively supports all required features: path-based routing to direct requests to different target groups based on URL patterns, TLS termination with HTTPS listeners, cookie-based sticky sessions (both application-controlled and load balancer-generated cookies), WebSocket protocol support, and HTTP to HTTPS redirect rules configured directly in the listener rules. This provides all functionality with minimal configuration and operational overhead. ALB is specifically designed for microservices architectures requiring content-based routing.
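ALB listener rules evaluate path patterns in priority order, with a default action when nothing matches. A rough sketch of that matching logic, using hypothetical target group names:

```python
import fnmatch

RULES = [  # (priority, path pattern, target group) -- names are hypothetical
    (10, "/auth/*", "tg-auth"),
    (20, "/records/*", "tg-records"),
    (30, "/billing/*", "tg-billing"),
]
DEFAULT_TG = "tg-default"  # the listener's default action

def route(path):
    """First matching rule by ascending priority wins, as in ALB listener rules."""
    for _priority, pattern, tg in sorted(RULES):
        if fnmatch.fnmatch(path, pattern):
            return tg
    return DEFAULT_TG

assert route("/auth/login") == "tg-auth"
assert route("/billing/invoice/42") == "tg-billing"
assert route("/health") == "tg-default"
```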
Why the other options are wrong:
- Option 1: Network Load Balancer operates at Layer 4 and does not support path-based routing or content-based routing decisions. NLB routes traffic based on IP protocol data (TCP/UDP ports and flow hash) and cannot inspect URL paths to make routing decisions. While NLB supports TLS termination, it lacks the Layer 7 features needed for this microservices routing pattern. Additionally, NLB does not support HTTP to HTTPS redirects.
- Option 3: Classic Load Balancer is the legacy load balancing option and does not support path-based routing, advanced request routing, or native HTTP to HTTPS redirect rules. While CLB supports sticky sessions, it cannot route different URL paths to different target groups. Achieving the required routing would require implementing path-based logic in the application layer, increasing complexity and violating the least operational overhead constraint.
- Option 4: Gateway Load Balancer is designed for transparent network appliance insertion (firewalls, IDS/IPS), not application traffic routing. Combining GWLB with multiple NLBs creates unnecessary architectural complexity, and NLBs still cannot perform path-based routing. This solution introduces significant operational overhead managing multiple load balancers and doesn't address the core routing requirement.
Key Insight: Path-based routing and content-aware routing decisions are Layer 7 capabilities exclusive to ALB. Candidates must recognize that NLB, despite supporting TLS and high performance, cannot inspect HTTP headers or URL paths to make routing decisions. The microservices pattern with URL-based routing is a classic ALB use case.
Case Study 3
A video streaming company runs a content delivery platform on AWS with origin servers hosted on EC2 instances behind a load balancer. The company recently implemented a new security architecture that requires all inbound traffic to pass through third-party network security appliances from two different vendors (a firewall appliance and a deep packet inspection appliance) before reaching the application servers. Both appliances must inspect every packet in sequence, and the appliances are deployed as clusters of virtual appliances running on EC2 instances for high availability. The security team requires that traffic flow be transparent to both the security appliances and the origin servers, meaning source and destination IP addresses must be preserved. The solution must automatically scale the security appliances and distribute traffic evenly across all appliance instances while maintaining high availability across multiple Availability Zones.
What combination of AWS services should the solutions architect implement? (Select TWO)
- Deploy a Gateway Load Balancer to distribute traffic to the security appliance clusters and use Gateway Load Balancer Endpoints in the VPC to integrate with the application architecture.
- Deploy an Application Load Balancer in front of the security appliances and configure custom routing rules to forward traffic through both appliance clusters sequentially.
- Configure VPC Traffic Mirroring to send copies of all traffic to the security appliances for inspection before forwarding to the origin servers.
- Use AWS Transit Gateway to route traffic through the security appliances deployed in a dedicated inspection VPC, then route to the application VPC.
- Configure the security appliances to use GENEVE protocol encapsulation to receive traffic from the Gateway Load Balancer and maintain flow stickiness to the same appliance instances.
Answer & Explanation
Correct Answer: 1 and 5 - Deploy a Gateway Load Balancer with GENEVE protocol encapsulation
Why these are correct: Gateway Load Balancer is specifically designed for transparently inserting third-party virtual network appliances into the traffic path. GWLB distributes traffic across security appliance instances using flow-based hash algorithms to ensure symmetric flow (same appliance sees both directions of a connection). Gateway Load Balancer Endpoints (GWLBe) act as VPC endpoints that intercept traffic and send it to the GWLB, enabling seamless integration without complex routing changes. GENEVE protocol encapsulation is the mechanism GWLB uses to send traffic to appliances while preserving original source and destination IP addresses, satisfying the transparency requirement. Flow stickiness ensures packets from the same flow consistently go to the same appliance instance, which is critical for stateful inspection.
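To make the transparency point concrete, here is a toy GENEVE encapsulation: an 8-byte base header is prepended and the inner packet, with its original source and destination addresses, rides through untouched. The layout follows the GENEVE base header (RFC 8926); the inner packet bytes are a stand-in:

```python
import struct

GENEVE_PORT = 6081   # IANA-assigned UDP port for GENEVE
ETH_P_IPV4 = 0x0800  # inner protocol type: IPv4

def geneve_encap(inner_packet: bytes, vni: int) -> bytes:
    """Minimal 8-byte GENEVE base header (version 0, no options) followed
    by the unmodified inner packet -- original IPs are preserved inside."""
    ver_optlen = 0x00                   # version 0, option length 0
    flags = 0x00
    hdr = struct.pack("!BBH", ver_optlen, flags, ETH_P_IPV4)
    hdr += struct.pack("!I", vni << 8)  # 24-bit VNI + 8 reserved bits
    return hdr + inner_packet

def geneve_decap(frame: bytes):
    _ver_optlen, _flags, proto = struct.unpack("!BBH", frame[:4])
    vni = struct.unpack("!I", frame[4:8])[0] >> 8
    return vni, proto, frame[8:]

inner = b"\x45\x00..."  # stand-in for an IPv4 packet with original addresses
vni, proto, payload = geneve_decap(geneve_encap(inner, vni=0x123456))
assert payload == inner and vni == 0x123456 and proto == ETH_P_IPV4
```

The appliance strips the header, inspects the inner packet as-is, and returns it the same way, which is why both the appliances and the origin servers see the real client and server addresses.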
Why the other options are wrong:
- Option 2: Application Load Balancer operates at Layer 7 and is designed for HTTP/HTTPS application traffic routing, not transparent network traffic inspection. ALB modifies packets by terminating connections and creating new connections to backends, which violates the requirement to preserve source and destination IP addresses. ALB cannot transparently insert network appliances into the traffic path or support arbitrary protocols needed by security appliances.
- Option 3: VPC Traffic Mirroring creates copies of traffic for monitoring and analysis purposes, sending duplicated packets to monitoring appliances. It does not insert appliances into the actual traffic flow and cannot enforce that traffic must pass through security appliances before reaching origin servers. Mirrored traffic is observational only; the original traffic flows directly to destinations without inspection-based enforcement.
- Option 4: While Transit Gateway can be used to route traffic between VPCs and through inspection VPCs, this approach requires complex routing table configurations and does not provide automatic load balancing across multiple security appliance instances within the inspection VPC. Transit Gateway alone doesn't distribute traffic evenly across appliance cluster members or provide health checking and automatic failover for individual appliance instances. This solution lacks the load balancing component needed for high availability and even distribution.
Key Insight: Gateway Load Balancer is purpose-built for this exact use case: transparent insertion of third-party security appliances. The key differentiator is understanding that GWLB preserves flow symmetry and IP addresses using GENEVE encapsulation, while ALB and NLB are designed for application traffic distribution with different characteristics. Many candidates miss that GENEVE is the protocol mechanism enabling transparency.
Case Study 4
An e-commerce company is experiencing intermittent connection timeout errors from mobile app users during flash sales. The application architecture consists of a Network Load Balancer distributing traffic to an Auto Scaling group of EC2 instances running a REST API. During investigation, the operations team discovered that when flash sales begin, the number of concurrent connections spikes from 50,000 to 2 million within 30 seconds. Users report that new connections fail during the first 2-3 minutes of the sale, but users who successfully connected earlier maintain their connections without issues. The Auto Scaling group successfully scales from 10 to 100 instances within 90 seconds, and CloudWatch metrics show that instance CPU utilization never exceeds 40% during flash sales. The Network Load Balancer is configured with connection idle timeout of 350 seconds and target group health checks running every 10 seconds.
What is the MOST likely cause of the connection timeout errors?
- The Network Load Balancer has not been pre-warmed to handle the sudden traffic spike, and the load balancer nodes are unable to scale quickly enough to handle the connection rate increase.
- The connection idle timeout setting of 350 seconds is causing the load balancer to maintain too many idle connections, exhausting available connection slots for new incoming connections.
- The health check interval of 10 seconds is too long, causing the load balancer to continue sending traffic to instances that are unhealthy or not fully initialized after scaling.
- The Auto Scaling group is using target tracking scaling policy based on CPU utilization, which doesn't respond quickly enough to sudden connection spikes that occur faster than the 90-second scaling completion time.
Answer & Explanation
Correct Answer: 1 - The Network Load Balancer has not been pre-warmed
Why this is correct: Network Load Balancers automatically scale to handle traffic increases, but this scaling process takes time (typically minutes). When traffic increases suddenly from 50,000 to 2 million connections in 30 seconds, the NLB nodes may not scale quickly enough to handle the new connection rate, resulting in connection timeouts for new connections during the scaling period. The fact that users who connected early maintain connections indicates the backend instances are healthy and capable, but new connections cannot be accepted because the NLB itself is at capacity. For predictable traffic spikes like planned flash sales, AWS recommends pre-warming the load balancer by contacting AWS Support, which provisions additional load balancer capacity in advance. The symptom pattern (early connections succeed, new connections during the spike fail, then stability returns after a few minutes) is classic NLB scaling behavior.
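A toy capacity model illustrates why the failures are confined to the first minutes of the spike: demand jumps instantly while load-balancer capacity ramps up linearly. All numbers here are hypothetical; real NLB scaling is managed internally by AWS and is not observable at this granularity:

```python
def simulate(spike_cps, lb_capacity_cps, scale_rate_cps_per_s, seconds):
    """Toy model: demand is instantly at spike level while the LB's accept
    rate grows linearly; connections above capacity time out."""
    failures = []
    cap = lb_capacity_cps
    for t in range(seconds):
        failed = max(0, spike_cps - cap)   # new connections rejected this second
        failures.append((t, failed))
        cap = min(spike_cps, cap + scale_rate_cps_per_s)
    return failures

# Hypothetical numbers: demand jumps to 60k new connections/s, the LB
# starts at 5k/s and gains 500/s of capacity each second.
f = simulate(60_000, 5_000, 500, 180)
first_ok = next(t for t, failed in f if failed == 0)
print(f"new connections fail for the first {first_ok} seconds")  # ~2 minutes
```

The backend fleet never appears in this model at all, matching the observed symptom: instance CPU stays low while the bottleneck sits in front of it.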
Why the other options are wrong:
- Option 2: The connection idle timeout setting affects how long the load balancer maintains idle connections before closing them, but a 350-second timeout is within normal parameters and would not prevent new connections from being established. NLBs can handle millions of concurrent connections, and idle connections do not exhaust connection slots. The issue manifests during new connection establishment, not during idle connection management, and existing connections remain stable.
- Option 3: A 10-second health check interval is reasonable and well within the supported range. The scenario states that instances successfully scale to 100 instances and CPU utilization remains at 40%, indicating instances are healthy and underutilized. If unhealthy targets were the issue, we would see connection failures distributed across the entire duration and eventual server errors, not specifically during the initial spike period with recovery after a few minutes.
- Option 4: While the 90-second scaling time creates a temporary capacity gap, the scenario explicitly states that CPU utilization never exceeds 40%, meaning the instances themselves have plenty of capacity to handle the traffic. The bottleneck is not at the instance layer. Additionally, the Auto Scaling group does successfully scale to 100 instances, yet connection timeouts still occur during the first 2-3 minutes, indicating the problem persists even after instances are available. This points to the load balancer layer, not the Auto Scaling response time.
Key Insight: Understanding that load balancers themselves have scaling limits is critical. Many candidates assume load balancers have infinite immediate capacity and focus exclusively on backend scaling. The time-bound nature of the problem (first 2-3 minutes only) combined with healthy, underutilized backends points to the NLB scaling lag. Pre-warming is a real operational practice for planned traffic events.
Case Study 5
A logistics company operates a real-time package tracking system that receives GPS coordinates from delivery vehicles every 2 seconds. The system processes approximately 500,000 UDP packets per second from 50,000 delivery vehicles across multiple regions. Each UDP packet contains location data and must be processed by the same backend server for the duration of a delivery route to maintain consistent geospatial calculations. The backend consists of stateful processing servers running on EC2 instances that maintain routing algorithms in memory. The company requires the solution to preserve the source IP address of each vehicle for security logging and requires the lowest possible latency for location updates. The company wants to minimize costs while maintaining the required performance characteristics.
Which load balancing solution meets all requirements at the lowest cost?
- Deploy an Application Load Balancer with UDP listener support, configure sticky sessions based on source IP address, and enable access logs to S3 for security logging of source IP addresses.
- Deploy a Network Load Balancer with UDP listeners, enable cross-zone load balancing for even distribution, and configure flow-based stickiness to ensure packets from the same vehicle reach the same backend instance.
- Deploy multiple Classic Load Balancers configured for UDP traffic with connection draining enabled to handle vehicle disconnections gracefully and preserve source IP addresses.
- Deploy a Gateway Load Balancer with GENEVE encapsulation to preserve source IP addresses and distribute UDP traffic across backend instances while maintaining flow consistency.
Answer & Explanation
Correct Answer: 2 - Deploy a Network Load Balancer with UDP listeners
Why this is correct: Network Load Balancer natively supports the UDP protocol at Layer 4, can handle millions of requests per second with ultra-low latency, and automatically preserves source IP addresses without any additional configuration. NLB uses a flow hash algorithm (based on protocol, source IP, source port, destination IP, and destination port) to implement flow stickiness, ensuring all packets from the same vehicle (same source IP and port combination) are consistently routed to the same target instance, which is critical for maintaining stateful geospatial calculations. NLB is typically the lowest-cost modern load balancer for high-throughput scenarios: usage is billed per Network Load Balancer Capacity Unit (NLCU), and plain Layer 4 forwarding avoids the per-request processing cost of HTTP/HTTPS. Cross-zone load balancing can be enabled for distribution across Availability Zones.
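NLCU billing charges on the maximum across several usage dimensions per hour. A sketch of that calculation, with the UDP dimension values treated as illustrative (verify against the current AWS pricing page before relying on them):

```python
# Illustrative NLCU dimensions for UDP traffic (check current AWS pricing):
UDP_DIMS = {"new_flows_per_s": 400, "active_flows": 50_000, "gb_per_hour": 1}

def nlcus(new_flows_per_s, active_flows, gb_per_hour):
    """NLB billing takes the MAX across the per-dimension ratios."""
    return max(new_flows_per_s / UDP_DIMS["new_flows_per_s"],
               active_flows / UDP_DIMS["active_flows"],
               gb_per_hour / UDP_DIMS["gb_per_hour"])

# 50,000 long-lived vehicle flows, few new flows once routes are running,
# and an assumed ~200-byte datagram size driving the processed-bytes figure:
print(nlcus(new_flows_per_s=250, active_flows=50_000, gb_per_hour=360))
```

For packet-heavy workloads like this one, the processed-bytes dimension typically dominates the bill, which is why a lightweight Layer 4 balancer is the cost-efficient choice.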
Why the other options are wrong:
- Option 1: Application Load Balancer operates exclusively at Layer 7 and supports only HTTP, HTTPS, gRPC, and WebSocket traffic. ALB does not support UDP listeners and cannot process UDP packets, making it completely incompatible with the UDP-based vehicle tracking protocol. Additionally, ALB sticky sessions are designed for HTTP cookies, not UDP flows, and ALB would introduce significantly higher latency due to Layer 7 processing.
- Option 3: Classic Load Balancer supports TCP and HTTP/HTTPS protocols but does not support UDP traffic. CLB can operate at Layer 4 for TCP traffic, but the vehicle tracking system explicitly uses UDP packets for location updates. CLB cannot process UDP datagrams, making it unsuitable for this use case. Additionally, CLB is a legacy service with higher operational overhead and lacks modern features.
- Option 4: Gateway Load Balancer is designed for deploying third-party virtual network appliances (firewalls, IDS/IPS systems) in the traffic path, not for general-purpose application load balancing. While GWLB can handle UDP traffic and preserves source IP through GENEVE encapsulation, it introduces unnecessary overhead and complexity for this use case. GWLB is more expensive than NLB because it's intended for specialized security appliance integration scenarios. Using GWLB for simple UDP load distribution would increase costs without providing value, violating the cost minimization requirement.
Key Insight: Protocol support is the primary eliminating factor: only NLB and GWLB support UDP among modern load balancers. The distinction between GWLB and NLB comes down to use case: GWLB is for inserting network appliances transparently, while NLB is for distributing application traffic. Cost-consciousness should guide candidates toward NLB for straightforward traffic distribution rather than the specialized GWLB.
Case Study 6
A media streaming company is launching a live sports broadcasting service that will serve millions of concurrent viewers during major sporting events. The application uses HTTP Live Streaming (HLS) protocol to deliver video segments to viewers. Each viewer's session lasts an average of 2 hours and consists of sequential requests for small video segment files every 6 seconds. The company's content delivery architecture includes CloudFront distributions in front of origin servers running on EC2 instances. The solutions architect needs to select a load balancer to sit between CloudFront and the origin servers. The company requires detailed access logs showing viewer request patterns for analytics, needs to implement custom HTTP headers to pass viewer geographic information to origin servers for regional content customization, and wants to minimize data transfer costs between the load balancer and CloudFront. SSL/TLS termination must occur at the load balancer level.
Which solution meets these requirements while optimizing for cost?
- Deploy a Network Load Balancer with TLS listeners to terminate SSL connections, configure static IP addresses for CloudFront to connect to, and enable VPC Flow Logs to capture viewer request patterns for analytics.
- Deploy an Application Load Balancer with HTTPS listeners, configure custom HTTP headers using listener rules to add geographic information, enable access logging to S3, and enable connection draining for graceful handling of long viewer sessions.
- Deploy a Gateway Load Balancer to inspect HTTP traffic from CloudFront, add custom headers containing viewer information, and route requests to origin servers while maintaining connection state for streaming sessions.
- Deploy a Classic Load Balancer with HTTPS listeners and configure X-Forwarded-For headers to preserve viewer IP addresses, then use CloudWatch Logs to capture access patterns.
Answer & Explanation
Correct Answer: 2 - Deploy an Application Load Balancer with HTTPS listeners and custom headers
Why this is correct: Application Load Balancer provides all required capabilities: native HTTPS listener support for TLS termination, the ability to insert custom HTTP headers using listener rules (including geographic and viewer information), comprehensive access logging to S3 that captures detailed request information for analytics, and connection draining (called deregistration delay on ALB) to handle long-lived streaming sessions gracefully. ALB can add custom headers to requests before forwarding to targets, enabling the origin servers to receive viewer geographic information without application code changes. ALB access logs include rich request-level data including paths, user agents, response codes, and processing times, which are essential for streaming analytics. Data transfer from in-region AWS origins, including an ALB, to CloudFront is not billed as internet data transfer out, which keeps costs between the load balancer and CloudFront low.
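ALB access logs are plain space-delimited text with the request and user-agent strings quoted. A simplified parsing sketch; the sample line below abbreviates the format (real entries carry additional trailing fields), so treat the field positions as illustrative:

```python
import shlex

# A simplified ALB access-log line (abbreviated; real entries have more fields):
line = ('https 2024-06-01T20:15:03.123456Z app/media-alb/50dc6c495c0c9188 '
        '203.0.113.9:46532 10.0.2.14:443 0.000 0.004 0.000 200 200 34 5843 '
        '"GET https://stream.example.com:443/hls/seg0001.ts HTTP/1.1" '
        '"ExamplePlayer/2.0" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2')

fields = shlex.split(line)  # honors the quoted request / user-agent fields
entry = {
    "client": fields[3].rsplit(":", 1)[0],  # viewer IP without the port
    "status": int(fields[8]),               # ELB status code
    "request": fields[12],                  # full request line
    "user_agent": fields[13],
}
assert entry["client"] == "203.0.113.9"
assert "seg0001.ts" in entry["request"]
```

Request lines like these, aggregated across millions of viewers, are what make per-segment analytics (popular streams, bitrate switches, player versions) possible without touching the origin servers.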
Why the other options are wrong:
- Option 1: While Network Load Balancer supports TLS termination and can provide static IP addresses that CloudFront can use, NLB operates at Layer 4 and cannot inspect or modify HTTP headers. NLB cannot add custom HTTP headers containing geographic information because it doesn't process HTTP protocol data; it simply forwards TCP packets. Additionally, VPC Flow Logs capture network-level information (source/destination IPs, ports, protocols, byte counts) but not application-level request details like URLs, user agents, or request patterns needed for meaningful streaming analytics. NLB access logs are limited compared to ALB's detailed HTTP request logging.
- Option 3: Gateway Load Balancer is designed for transparent insertion of third-party virtual network appliances for security inspection and traffic analysis, not for origin load balancing in content delivery architectures. GWLB uses GENEVE encapsulation to send traffic to appliances and expects appliances to perform inspection and return traffic. It is not designed to add custom HTTP headers or perform application-level traffic routing. Using GWLB for this scenario introduces unnecessary complexity, cost, and architectural overhead without providing the required HTTP header manipulation capability.
- Option 4: Classic Load Balancer is the legacy load balancing option and lacks modern features like custom HTTP header insertion through configuration. While CLB supports X-Forwarded-For headers to preserve original client IP addresses, it cannot add arbitrary custom headers containing geographic information without application-level modifications. CLB does offer access logging to S3, but its entries carry less request-level detail than ALB's and are less useful for HTTP request analysis. CLB also has higher operational overhead and is not recommended for new implementations.
Key Insight: The requirement to add custom HTTP headers immediately points to Layer 7 capabilities that only ALB provides among the load balancer options. Candidates must recognize that Layer 4 load balancers (NLB) cannot inspect or modify HTTP headers because they operate below the HTTP protocol layer. The detailed logging requirement for analytics further reinforces ALB as the appropriate choice.
Case Study 7
A multinational corporation is migrating its SAP ERP system to AWS. The SAP application uses a three-tier architecture with web servers, application servers, and database servers. The SAP application servers communicate with the database tier using a proprietary protocol over TCP port 3200, and these connections must maintain session state for up to 8 hours during long-running batch processes. The application servers are distributed across three Availability Zones for high availability. The company's network team requires that the source IP addresses of application servers be preserved when connecting to the database tier for security audit logging and IP-based access control lists configured on the database instances. The database team reports that the SAP database performs best when connections from the same application server consistently reach the same database instance to take advantage of database connection pooling and cached query plans. The company cannot modify the SAP application code or database configuration.
What load balancing configuration should be implemented between the application tier and database tier?
- Deploy an Application Load Balancer with TCP listener support on port 3200, configure sticky sessions using application-generated cookies, and set connection idle timeout to 8 hours to support long-running batch processes.
- Deploy a Network Load Balancer with a TCP listener on port 3200, configure the target group with connection termination disabled to preserve source IP addresses, and enable cross-zone load balancing to distribute connections across database instances in all Availability Zones.
- Deploy a Network Load Balancer with client IP preservation enabled, configure flow-based stickiness to ensure consistent routing from the same source to the same target, and set connection idle timeout to accommodate 8-hour sessions.
- Deploy a Gateway Load Balancer to transparently route database traffic while preserving source IP addresses, and configure GENEVE encapsulation to maintain connection state for long-running sessions.
Answer & Explanation
Correct Answer: 3 - Deploy a Network Load Balancer with client IP preservation and flow-based stickiness
Why this is correct: Network Load Balancer operating at Layer 4 natively preserves source IP addresses without additional configuration, meeting the security audit and IP-based ACL requirements. NLB's flow hash algorithm automatically provides flow-based stickiness, ensuring that packets from the same source IP and port combination consistently route to the same target (database instance), which satisfies the requirement for consistent routing to optimize database connection pooling and query plan caching. NLB supports long-lived TCP connections: the TCP idle timeout defaults to 350 seconds, and a connection stays open indefinitely as long as traffic (or TCP keepalives) continues to flow, which accommodates the 8-hour batch sessions. The proprietary TCP protocol on port 3200 is fully supported by NLB's Layer 4 operation without requiring HTTP protocol processing.
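Since no idle-timeout setting reaches 8 hours, the practical way to keep a quiet batch connection alive is OS-level TCP keepalives tuned below the NLB timeout. A sketch with illustrative values (TCP_KEEPIDLE/TCP_KEEPINTVL are Linux-specific, hence the guard):

```python
import socket

# Keepalives sent well inside the NLB idle timeout keep an otherwise quiet
# 8-hour batch connection from being reaped. The intervals are illustrative.
IDLE_TIMEOUT_S = 350        # NLB default TCP idle timeout
KEEPALIVE_IDLE_S = 120      # start probing after 2 minutes of silence
KEEPALIVE_INTERVAL_S = 60   # then probe every minute

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
if hasattr(socket, "TCP_KEEPIDLE"):  # Linux-only socket options
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, KEEPALIVE_IDLE_S)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, KEEPALIVE_INTERVAL_S)

# The first probe must land before the idle timeout expires:
assert KEEPALIVE_IDLE_S + KEEPALIVE_INTERVAL_S < IDLE_TIMEOUT_S
assert s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) == 1
s.close()
```

In the SAP scenario the constraint that neither the application nor the database can be modified makes OS-level keepalive tuning attractive, since it is configured on the host rather than in application code.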
Why the other options are wrong:
- Option 1: Application Load Balancer operates at Layer 7 and requires HTTP or HTTPS protocols. ALB does not support arbitrary TCP protocols or custom ports in the way required for the SAP proprietary database protocol on port 3200. While ALB can listen on custom ports, it expects HTTP protocol traffic and cannot handle non-HTTP TCP protocols. Additionally, ALB sticky sessions use HTTP cookies, which cannot be applied to non-HTTP proprietary protocols. ALB also terminates and re-establishes connections, which prevents source IP preservation without using X-Forwarded-For headers that only work for HTTP protocols.
- Option 2: While Network Load Balancer is the correct service choice and does preserve source IP addresses, this option contains a technical inaccuracy: "connection termination disabled" is not a configurable parameter on NLB. NLB by default does not terminate connections at Layer 4; it operates in pass-through mode, preserving the original connection attributes. The more significant issue is that this option never addresses the requirement for consistent routing from the same application server to the same database instance: enabling cross-zone load balancing alone doesn't guarantee the consistent routing behavior needed for database connection pooling optimization.
- Option 4: Gateway Load Balancer is designed for inserting third-party virtual network appliances (firewalls, IDS/IPS) into the network path, not for load balancing database traffic in a three-tier application architecture. GWLB uses GENEVE encapsulation to send traffic to appliances and expects the traffic to be processed by network appliances before being returned to the flow. There are no network appliances in this scenario, and GWLB would introduce unnecessary complexity and cost. While GWLB can preserve source IPs, it's architecturally inappropriate for internal tier-to-tier load balancing in an application stack.
Key Insight: The proprietary TCP protocol requirement immediately eliminates ALB, which requires HTTP/HTTPS. The source IP preservation requirement combined with flow consistency points directly to NLB's native capabilities. Many candidates confuse "sticky sessions" (an HTTP/application-layer concept) with "flow stickiness" (a transport-layer concept based on connection tuples), which is the key distinction here.
Case Study 8
A software-as-a-service company provides a project management application to enterprise customers. Each customer is assigned to a dedicated set of EC2 instances for data isolation and compliance requirements. The company uses a single Application Load Balancer to route incoming requests to the appropriate customer-specific target group based on the subdomain in the request (e.g., customer-a.example.com routes to Customer A's instances, customer-b.example.com routes to Customer B's instances). The company currently has 80 enterprise customers and expects to grow to 500 customers within the next year. Each customer's environment consists of 3-10 EC2 instances in a target group. The security team requires mutual TLS authentication for all customer connections, with each customer using their own TLS certificate for authentication. The company wants to minimize operational overhead while maintaining the isolation and routing requirements.
Which approach should the solutions architect recommend to meet these requirements as the customer base scales?
- Continue using a single Application Load Balancer and configure host-based routing rules for each customer subdomain, associating each rule with the customer's dedicated target group, and upload each customer's TLS certificate to AWS Certificate Manager for mutual TLS authentication.
- Deploy a separate Application Load Balancer for each customer to provide complete isolation, configure each ALB with the customer's TLS certificate for mutual TLS, and use Route 53 alias records to route each customer subdomain to their dedicated ALB.
- Deploy a Network Load Balancer with TLS listeners and configure SNI (Server Name Indication) to support multiple TLS certificates, then use host-based routing at the application layer to direct requests to appropriate customer instances.
- Implement AWS Global Accelerator with multiple Application Load Balancers as endpoints, using custom routing to direct each customer to their dedicated ALB based on subdomain, and configure mutual TLS at the Global Accelerator layer.
Answer & Explanation
Correct Answer: 1 - Continue using a single Application Load Balancer with host-based routing rules
Why this is correct: A single Application Load Balancer supports host-based routing on the HTTP Host header, allowing rules that direct requests for customer-a.example.com to Target Group A, customer-b.example.com to Target Group B, and so forth. The default quota of 100 rules per ALB is a soft limit that can be raised via a service quota increase to accommodate 500 customers (one routing rule and one target group each); the default certificate quota per ALB is likewise adjustable. ALB serves a different server certificate per customer subdomain using Server Name Indication (SNI), with certificates stored in AWS Certificate Manager, and client-certificate verification for mutual TLS is configured on the HTTPS listener. This approach minimizes operational overhead by managing a single load balancer rather than hundreds of separate ones, reducing costs, simplifying monitoring, and centralizing management while target groups maintain the required logical isolation.
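Host-based routing can be pictured as a lookup from the Host header to a target group. The rule table and target-group names below are illustrative stand-ins, not real AWS resources:

```python
# Hypothetical rule table mirroring ALB host-based routing: one rule and
# one dedicated target group per customer subdomain.
RULES = {
    "customer-a.example.com": "tg-customer-a",
    "customer-b.example.com": "tg-customer-b",
}
DEFAULT_ACTION = "fixed-response-404"  # listener default when no rule matches

def route(host_header: str) -> str:
    """Return the target group for a request's Host header, falling back
    to the listener's default action (host matching is case-insensitive)."""
    return RULES.get(host_header.lower(), DEFAULT_ACTION)

assert route("Customer-A.example.com") == "tg-customer-a"
assert route("unknown.example.com") == "fixed-response-404"
```

Scaling to 500 customers is then an exercise in adding one entry per tenant, which is why the rule quota (a soft limit) is the number to watch, not the load balancer count.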
Why the other options are wrong:
- Option 2: Deploying a separate ALB for each customer would result in 500 Application Load Balancers at scale, creating massive operational overhead for management, monitoring, updating, and cost control. Each ALB has a fixed hourly cost plus LCU charges, making this approach extremely expensive (500x the base hourly cost). Managing 500 separate ALBs for TLS certificate rotation, security updates, and configuration changes would be operationally unsustainable. While this provides strong isolation, it directly violates the "minimize operational overhead" requirement and is cost-prohibitive compared to using a single ALB with routing rules.
- Option 3: Network Load Balancer supports TLS termination and SNI for multiple certificates, but NLB operates at Layer 4 and does not support host-based routing or HTTP header inspection. NLB cannot examine the HTTP Host header to make routing decisions based on subdomains. Pushing host-based routing logic to the application layer means every customer's instances would receive all traffic, and the application code would need to handle routing and rejection of misrouted requests. This increases application complexity, violates the data isolation requirement (traffic touches the wrong customer's instances before being rejected), and creates security concerns.
- Option 4: AWS Global Accelerator is designed to improve global application availability and performance by routing traffic through the AWS global network to optimal regional endpoints. Global Accelerator does not support content-based or subdomain-based routing; it routes based on geographic proximity and endpoint health. It also cannot perform mutual TLS authentication or inspect HTTP Host headers. This solution introduces unnecessary cost and complexity without providing the required host-based routing capability. Additionally, "custom routing" in Global Accelerator maps port-based traffic to specific EC2 instances in a VPC subnet; it has nothing to do with routing on application-layer attributes such as subdomains.
Key Insight: ALB's ability to support multiple target groups and rules, combined with SNI for multiple TLS certificates, makes it highly scalable for multi-tenant architectures without requiring separate load balancer instances per tenant. Candidates who immediately think "one load balancer per customer" for isolation miss the cost and operational implications at scale. Understanding ALB's routing rule capacity limits and SNI capabilities is essential.
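The cost argument above is simple arithmetic. A back-of-the-envelope sketch, assuming an illustrative per-ALB hourly rate (actual pricing varies by region, and LCU charges are excluded):

```python
# Fixed-cost comparison: one shared ALB vs. one ALB per customer.
ALB_HOURLY = 0.0225       # assumed hourly rate in USD, for illustration only
HOURS_PER_MONTH = 730

def monthly_fixed_cost(num_albs: int) -> float:
    """Fixed hourly charges per month, ignoring LCU (usage) charges."""
    return num_albs * ALB_HOURLY * HOURS_PER_MONTH

shared = monthly_fixed_cost(1)        # single ALB with 500 routing rules
per_tenant = monthly_fixed_cost(500)  # one ALB per enterprise customer
# Per-tenant ALBs multiply the fixed cost 500x before any traffic flows.
assert abs(per_tenant - 500 * shared) < 1e-6
print(f"shared: ${shared:,.2f}/mo   per-tenant: ${per_tenant:,.2f}/mo")
```

The same 500x factor applies to certificate rotations, alarm configuration, and every other per-load-balancer task, which is the operational-overhead half of the argument.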
Case Study 9
A genomics research institute processes large-scale DNA sequencing data using a distributed computing cluster on AWS. The compute cluster consists of 200 high-memory EC2 instances that run analysis jobs submitted by researchers. Each analysis job is submitted via a web interface and requires persistent TCP connections that can last between 6 to 72 hours as the computation progresses. The web application runs on a separate tier of web servers behind a load balancer. Researchers connect to the web interface over HTTPS and must maintain their session state during the entire analysis job duration. The institute experiences unpredictable traffic patterns, with usage varying from 20 concurrent users during nights and weekends to 300 concurrent users during peak research periods. The security team requires end-to-end encryption for all data in transit and must terminate TLS at the load balancer for compliance inspection purposes. The institute wants to optimize costs while maintaining support for the long-lived connections.
Which load balancing configuration is MOST cost-effective while meeting all technical requirements?
- Deploy a Network Load Balancer with TLS listeners for encryption termination, enable connection draining with a 72-hour timeout to support long-running jobs, and configure static IP addresses for simplified DNS management.
- Deploy an Application Load Balancer with HTTPS listeners, configure sticky sessions using load balancer-generated cookies to maintain session state, set connection idle timeout to the maximum allowed value, and enable deletion protection to prevent accidental removal during long-running sessions.
- Deploy an Application Load Balancer with multiple HTTPS listeners for different analysis job types, configure Lambda functions as targets to route long-running connections to the appropriate compute instances, and enable AWS WAF for security inspection.
- Deploy two Network Load Balancers in different Availability Zones with TLS termination, use Route 53 health checks to failover between them, and configure connection tracking to maintain state for 72-hour sessions.
Answer & Explanation
Correct Answer: 2 - Deploy an Application Load Balancer with HTTPS listeners and sticky sessions
Why this is correct: Application Load Balancer supports HTTPS listeners for TLS termination, meeting the encryption and compliance inspection requirements. ALB sticky sessions using load balancer-generated cookies maintain session affinity, so researchers stay connected to the same web server throughout an analysis job and session state is preserved. ALB's connection idle timeout can be configured up to 4000 seconds (over 1 hour); while this doesn't cover the full 72-hour job duration, HTTP-based applications typically implement keep-alive mechanisms or periodic client-side polling that reset the idle timer, preventing timeouts on active sessions. For web traffic over HTTP/HTTPS, ALB is generally more cost-effective than NLB because it is optimized for HTTP workloads and its LCU (Load Balancer Capacity Unit) charges are typically lower for web traffic patterns. ALB's Layer 7 capabilities provide better value for HTTP-based applications.
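The keep-alive reasoning can be made concrete: any request/response activity resets the idle timer, so what matters is that the heartbeat interval stays below the idle timeout, not that the timeout covers the whole job. A minimal sketch:

```python
# Will periodic client activity keep an ALB connection alive?
ALB_IDLE_TIMEOUT_MAX = 4000  # seconds, ALB's configurable maximum

def survives_idle_timeout(heartbeat_interval_s: float,
                          idle_timeout_s: float) -> bool:
    """True when the next heartbeat arrives before the idle timer fires.
    Each heartbeat resets the timer, so this holds for any job duration."""
    return heartbeat_interval_s < idle_timeout_s

# A 60-second status poll keeps a 72-hour job's session alive, even though
# 72 hours (259,200 s) vastly exceeds the 4000 s idle timeout itself.
assert survives_idle_timeout(60, ALB_IDLE_TIMEOUT_MAX)
assert not survives_idle_timeout(5000, ALB_IDLE_TIMEOUT_MAX)
```

This is why the 72-hour figure in the question is a red herring for the idle timeout, but a real constraint for options that promise 72-hour draining or tracking windows.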
Why the other options are wrong:
- Option 1: While Network Load Balancer supports TLS termination and long-lived TCP connections, connection draining (deregistration delay) has a maximum of 3600 seconds (1 hour), not 72 hours as this option suggests, and NLB's TCP idle timeout (350 seconds by default) is likewise nowhere near 72 hours. More importantly, since the application explicitly uses a web interface over HTTPS, ALB is more cost-effective for HTTP/HTTPS workloads than NLB: NLB's LCU calculations are based on different metrics and typically cost more for HTTP traffic patterns, and NLB lacks the HTTP-specific features that make ALB the better fit for web applications. The static IP feature, while useful in some scenarios, provides no value here and doesn't justify the added cost.
- Option 3: Using Lambda functions as ALB targets for routing long-running connections is architecturally inappropriate and technically infeasible. Lambda functions have a maximum execution timeout of 15 minutes, making them incompatible with connections lasting 6-72 hours. Lambda is designed for short-lived, event-driven processing, not maintaining persistent connections to compute clusters. This architecture would require constant Lambda invocations and complex state management, dramatically increasing costs and complexity. Additionally, AWS WAF can be attached to ALB for security inspection without requiring Lambda in the traffic path, so this design introduces unnecessary components that increase cost without providing the required functionality.
- Option 4: Deploying two Network Load Balancers with Route 53 failover introduces unnecessary redundancy and cost. A single NLB already operates across multiple Availability Zones and provides high availability without additional load balancers; it scales across AZs and fails over natively. This design would double the fixed hourly cost of load balancing without adding availability beyond what a single NLB delivers. Furthermore, as explained for Option 1, NLB is not the most cost-effective choice for HTTPS web application traffic, and "connection tracking to maintain state for 72-hour sessions" is misleading: connection tracking maintains flow state at the network level but does not extend the idle timeout, which remains far short of 72 hours.
Key Insight: Cost optimization in load balancer selection depends heavily on the protocol and traffic pattern. For HTTP/HTTPS web applications, ALB is generally more cost-effective than NLB despite NLB's reputation for high performance, because ALB is specifically optimized for HTTP workloads. Understanding the actual timeout limits (connection draining max of 1 hour, idle timeout max of ~1 hour) versus what's needed (72-hour jobs) reveals that keep-alive traffic in HTTP sessions prevents idle timeout in practice.
Case Study 10
A financial technology startup is building a payment processing platform that must comply with PCI DSS requirements. The platform receives payment transaction requests from merchant applications over HTTPS and processes them through a series of microservices running in containers on Amazon ECS. The architecture requires that all incoming HTTPS traffic pass through a third-party PCI-certified intrusion detection and prevention system (IDS/IPS) running as virtual appliances on EC2 instances before reaching the payment processing microservices. The IDS/IPS appliances must inspect all traffic in both directions (request and response) to detect fraudulent patterns and must see the actual source IP addresses of merchant applications for fraud analysis. After inspection, traffic should be routed to an Application Load Balancer that distributes requests to the payment processing microservices based on the transaction type indicated in the URL path (/credit-card, /debit-card, /wire-transfer). The solution must support automatic scaling of both the IDS/IPS appliances and the payment processing microservices while maintaining high availability across multiple Availability Zones.
What combination of load balancing services should the solutions architect implement to meet these requirements?
- Deploy a Network Load Balancer to receive incoming HTTPS traffic, configure it to route traffic through the IDS/IPS appliances registered as targets, then configure the IDS/IPS appliances to forward inspected traffic to an Application Load Balancer for path-based routing to microservices.
- Deploy a Gateway Load Balancer with Gateway Load Balancer Endpoints to transparently insert the IDS/IPS appliances into the traffic path for bidirectional inspection, then route traffic to an Application Load Balancer configured with path-based routing rules for the payment processing microservices.
- Deploy an Application Load Balancer with Lambda functions as targets that route traffic to the IDS/IPS appliances for inspection, then invoke the appropriate payment processing microservice based on the Lambda function's inspection results and URL path analysis.
- Deploy AWS Transit Gateway to route traffic through a security VPC containing the IDS/IPS appliances, configure VPC peering between the security VPC and application VPC, then deploy an Application Load Balancer in the application VPC for path-based routing to microservices.
Answer & Explanation
Correct Answer: 2 - Deploy a Gateway Load Balancer with Gateway Load Balancer Endpoints and an Application Load Balancer
Why this is correct: Gateway Load Balancer is specifically designed for transparent insertion of third-party virtual network appliances like IDS/IPS systems into the traffic path. GWLB distributes traffic across multiple IDS/IPS appliance instances for high availability and automatic scaling, ensuring all traffic is inspected. GWLB uses GENEVE encapsulation to send traffic to appliances while preserving the original source IP addresses, which is critical for the fraud analysis requirement. Gateway Load Balancer Endpoints (GWLBe) act as VPC endpoints that intercept traffic and redirect it to the GWLB for inspection before allowing it to continue to the application. This provides bidirectional inspection-both request and response traffic flows through the appliances. After inspection, traffic continues to the Application Load Balancer, which provides Layer 7 path-based routing capabilities to route /credit-card to one target group, /debit-card to another, and /wire-transfer to a third. This architecture cleanly separates security inspection (GWLB) from application routing (ALB), each performing its specialized function.
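The ALB half of this architecture, path-based routing, can be sketched as ordered prefix matching. Rule ordering and the target-group names here are illustrative:

```python
# Hypothetical path-prefix rules mirroring the ALB routing described above:
# rules are evaluated in priority order; the first match wins.
PATH_RULES = [
    ("/credit-card", "tg-credit-card"),
    ("/debit-card", "tg-debit-card"),
    ("/wire-transfer", "tg-wire-transfer"),
]

def route_by_path(path: str) -> str:
    """Return the target group for the first matching path prefix, or the
    listener's default target group when nothing matches."""
    for prefix, target_group in PATH_RULES:
        if path.startswith(prefix):
            return target_group
    return "tg-default"

assert route_by_path("/credit-card/authorize") == "tg-credit-card"
assert route_by_path("/health") == "tg-default"
```

The GWLB stage is deliberately invisible to this logic: by the time a request reaches the ALB listener, inspection has already happened, so the two layers can be designed and scaled independently.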
Why the other options are wrong:
- Option 1: Using NLB to route traffic through IDS/IPS appliances registered as targets creates a non-transparent inspection model where the appliances must act as proxies, terminating and re-establishing connections. This breaks source IP preservation unless complex NAT configurations are implemented at the appliance level. More critically, this architecture doesn't provide true bidirectional inspection-the return traffic from microservices to merchants would bypass the IDS/IPS appliances since responses flow from ALB back to the original NLB, not through the appliances. The traffic flow becomes NLB → appliances → ALB → microservices, but responses follow ALB → NLB, missing inspection on the return path. This violates the bidirectional inspection requirement essential for PCI DSS compliance.
- Option 3: Lambda functions have a maximum execution timeout of 15 minutes and are designed for short-lived event processing, not maintaining persistent connections for payment processing workflows. Using Lambda to route to IDS/IPS appliances and then invoke microservices creates unnecessary complexity and latency for each transaction. Lambda cannot maintain the network-level transparency required for IDS/IPS inspection-the appliances would see Lambda as the source, not the original merchant applications, breaking the source IP preservation requirement. Additionally, Lambda execution costs for high-volume payment processing would be significantly higher than using purpose-built load balancing infrastructure, and the architecture introduces unreliable state management between inspection and processing phases.
- Option 4: While Transit Gateway can route traffic between VPCs and through a security VPC containing IDS/IPS appliances, this approach doesn't provide automatic load balancing and health checking across multiple IDS/IPS appliance instances. TGW routing would require complex route table configurations to force traffic through appliances, and there's no built-in mechanism to distribute load evenly across multiple appliance instances or automatically scale them. If a single appliance fails, TGW routing doesn't automatically redirect traffic to healthy appliances without manual intervention or complex scripting. This solution also introduces VPC peering complexity and doesn't address how to achieve high availability and automatic scaling of the inspection layer, which is a core requirement. Additionally, VPC peering is not necessary when using TGW, as TGW already provides VPC interconnection, showing architectural confusion in the option.
Key Insight: Gateway Load Balancer combined with Application Load Balancer represents a common pattern for architectures requiring both security appliance inspection and application-layer routing. GWLB handles transparent network appliance insertion while preserving source IPs and providing bidirectional inspection, and ALB handles HTTP-specific routing logic. This separation of concerns is architecturally clean and purpose-built for PCI DSS and similar compliance scenarios requiring appliance-based inspection.
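The source-IP preservation point can be illustrated with a toy model of GENEVE encapsulation (GWLB actually uses GENEVE over UDP port 6081, with additional metadata TLVs this sketch omits): the original packet travels intact inside an outer wrapper addressed to the appliance, so the appliance sees the merchant's real source IP after decapsulation.

```python
# Toy model of GWLB's GENEVE encapsulation: the inner packet is wrapped,
# not rewritten, so source-IP information survives the inspection hop.
def encapsulate(inner_packet: dict, appliance_ip: str) -> dict:
    """Wrap the original packet in an outer header addressed to the
    inspection appliance (GENEVE runs over UDP port 6081)."""
    return {"outer_dst": appliance_ip, "outer_proto": "udp/6081",
            "inner": inner_packet}

def decapsulate(geneve_packet: dict) -> dict:
    """What the appliance inspects after stripping the outer header."""
    return geneve_packet["inner"]

merchant_pkt = {"src_ip": "203.0.113.7", "dst_ip": "10.0.0.5",
                "payload": b"POST /credit-card"}
seen = decapsulate(encapsulate(merchant_pkt, "10.0.9.20"))
assert seen["src_ip"] == "203.0.113.7"  # real merchant IP, for fraud analysis
```

Contrast this with a proxy-style insertion (Option 1), where the appliance terminates the connection and the original source IP is replaced unless NAT gymnastics are added.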