Last year, our team hit a wall at 5 million requests per day. Our Laravel application was running on a single EC2 instance with a modest RDS PostgreSQL database, and everything seemed fine—until it wasn't. One morning at 6:47 AM, I got the dreaded PagerDuty alert: response times had spiked to 8+ seconds, and our error rate was climbing past 15%. Users were complaining, our CEO was in my Slack DMs, and I was frantically SSHing into our production server in my pajamas.
Fast forward eighteen months, and we're now handling over 100 million requests per day with an average response time of 180ms and a 99.9% uptime. We didn't get there by following some magical tutorial or implementing a single silver bullet solution. We got there through systematic optimization, painful debugging sessions, architectural rewrites, and learning from our mistakes—lots of mistakes.
This isn't a theoretical guide. This is what actually worked for us, what failed spectacularly, and what I'd do differently if I had to start over. I'm going to show you the exact architecture we built, the specific AWS services we used, the database optimization techniques that made the biggest impact, and the caching strategies that saved us thousands of dollars per month. More importantly, I'll share the gotchas that nearly took us down and the monitoring setup that keeps us sleeping at night.
The Breaking Point: When Our Architecture Failed
Before I dive into solutions, let me paint a picture of where we started. Our initial architecture was what I call "optimistic simplicity"—a single t3.large EC2 instance running Nginx and PHP-FPM, connected to a db.t3.medium RDS PostgreSQL instance. We had Laravel's built-in file cache enabled, no queue workers, and we were processing everything synchronously. For our first year with under 1 million daily requests, this setup cost us about $280/month and worked perfectly fine.
The cracks started showing around 3 million requests per day. Our database CPU would spike to 80% during peak hours. Our application server would occasionally hit 90% memory usage. But we kept throwing band-aids at it—increasing PHP-FPM workers from 20 to 40, bumping the EC2 instance to t3.xlarge, upgrading the RDS instance to db.t3.large. Each fix bought us a few more weeks of runway.
Then we launched a major feature that went semi-viral on Product Hunt. Within 48 hours, our traffic jumped from 3M to 8M requests per day. Our single-server architecture completely collapsed. Here's what the actual error logs looked like:
[2024-03-15 06:47:23] production.ERROR: SQLSTATE[08006] [7]
FATAL: sorry, too many clients already
[2024-03-15 06:47:24] production.ERROR: SQLSTATE[08006] [7]
could not connect to server: Connection refused
[2024-03-15 06:47:25] production.ERROR: Maximum execution time of 30 seconds exceeded
Our database connection pool was maxed out at 100 connections, and every new request was getting rejected. PHP-FPM processes were hanging, waiting for database connections that would never come. The application was essentially dead in the water.
I spent that entire weekend rebuilding our infrastructure. My coworker Sarah and I were on a Zoom call from Saturday morning until Sunday night, deploying changes, monitoring metrics, rolling back when things broke, and slowly piecing together a scalable architecture. Here's what we learned.
The Architecture That Got Us to 100M Requests
Our current architecture is built on AWS and looks nothing like where we started. Here's the high-level overview:
Application Layer:
- Auto Scaling Group with 8-12 c6i.2xlarge EC2 instances (8 vCPU, 16GB RAM each)
- Application Load Balancer distributing traffic
- Laravel Octane with Swoole (persistent application state)
- PHP 8.3 with OPcache and JIT enabled
Database Layer:
- Aurora PostgreSQL Serverless v2 (2-16 ACUs, auto-scaling)
- 3 read replicas for query distribution
- PgBouncer connection pooler (transaction mode)
- RDS Proxy for connection management
Caching Layer:
- ElastiCache Redis cluster (3 nodes, r6g.xlarge)
- CloudFront CDN for static assets and API responses
- Application-level caching with Laravel's Redis driver
- Database query result caching with 5-minute TTL (a quick sketch of this follows the list below)
Queue System:
- SQS for job queuing
- 4 dedicated c6i.xlarge workers running Laravel Horizon
- Separate queues for critical, default, and low-priority jobs
Monitoring & Logging:
- CloudWatch for metrics and alarms
- X-Ray for distributed tracing
- Laravel Telescope in production (with heavy sampling)
- Custom Datadog dashboards for business metrics
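The query-result caching item above is the easiest piece to show in code. Here's a minimal sketch of a 5-minute-TTL cache using Laravel's Redis cache driver; the controller, cache key, and query are illustrative examples rather than our actual production code:

// Illustrative sketch: cache a hot product query in Redis for 5 minutes (300 seconds).
// The controller, cache key, and query are hypothetical, not our production code.
use App\Models\Product;
use Illuminate\Support\Facades\Cache;

class ProductController extends Controller
{
    public function index(int $categoryId)
    {
        $products = Cache::remember("products:category:{$categoryId}", 300, function () use ($categoryId) {
            // Runs only on a cache miss; the result is serialized into Redis (CACHE_DRIVER=redis)
            return Product::query()
                ->where('category_id', $categoryId)
                ->where('active', true)
                ->latest()
                ->limit(50)
                ->get();
        });

        return response()->json($products);
    }
}

The trade-off is the obvious one: anything cached this way can be up to five minutes stale, which is fine for product listings and disastrous for anything transactional.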
The monthly cost for this infrastructure runs about $4,800. At roughly 100 million requests per day (around 3 billion per month), that works out to about $0.0016 per 1,000 requests. Our old architecture cost $280/month, but at the 5 million requests per day it was struggling to serve, that was roughly $0.0019 per 1,000 requests. In other words, we scaled traffic 20x while the bill grew only about 17x, so our per-request cost actually went down while performance improved dramatically.
Let me break down each layer and show you exactly how we implemented it, including the configuration files, the gotchas we hit, and the performance improvements we measured.
Database Optimization: From 5s Queries to 50ms
Our database was the first bottleneck we hit, and it remained our biggest challenge throughout the entire scaling process. The problem wasn't just the database server specs—it was how we were using it. Let me show you the specific optimizations that made the biggest impact.
Connection Pooling: The Game Changer
The single biggest improvement came from implementing proper connection pooling. Laravel's default database configuration creates a new connection for every request, which is fine at low scale but disastrous at high scale. Here's why: PostgreSQL connection overhead is around 1-2ms per connection, plus you're limited by max_connections (100 by default on smaller RDS instances).
At 5M requests per day with an average response time of 500ms, you need about 29 concurrent connections to handle the load (5M requests / 86400 seconds * 0.5s response time). Sounds manageable, right? But traffic isn't evenly distributed. During our peak hour (usually 2-3 PM EST), we see 15% of daily traffic, which means we need 104 concurrent connections just for that hour. We were constantly hitting the connection limit.
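To make that arithmetic reproducible, here is the back-of-the-envelope version (it's just Little's law: concurrency is requests per second times average response time; the numbers are the ones quoted above):

// Back-of-the-envelope concurrency estimate (Little's law: concurrency ≈ req/s × avg response time)
$dailyRequests  = 5_000_000;
$avgResponseSec = 0.5;
$peakHourShare  = 0.15; // ~15% of the day's traffic lands in the 2-3 PM EST peak hour

$avgPerSecond  = $dailyRequests / 86_400;                   // ≈ 58 req/s averaged over the day
$peakPerSecond = ($dailyRequests * $peakHourShare) / 3_600; // ≈ 208 req/s during the peak hour

echo round($avgPerSecond * $avgResponseSec) . PHP_EOL;  // ≈ 29 concurrent connections on average
echo round($peakPerSecond * $avgResponseSec) . PHP_EOL; // ≈ 104 concurrent connections at peak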
We implemented PgBouncer in transaction mode, which pools connections at the transaction level. Here's our actual configuration:
[databases]
laravel = host=our-aurora-cluster.cluster-xxx.us-east-1.rds.amazonaws.com port=5432 dbname=production
[pgbouncer]
pool_mode = transaction
max_client_conn = 10000
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3
max_db_connections = 50
max_user_connections = 50
The key insight here is pool_mode = transaction. This allows PgBouncer to reuse database connections between transactions, not just between requests. With this configuration, we can handle 10,000 concurrent application connections with only 50 actual database connections.
In Laravel, we updated our database configuration to point to PgBouncer instead of directly to RDS:
// config/database.php
'pgsql' => [
    'driver' => 'pgsql',
    'host' => env('DB_HOST', 'pgbouncer-internal-lb-xxx.us-east-1.elb.amazonaws.com'),
    'port' => env('DB_PORT', '6432'), // PgBouncer port
    'database' => env('DB_DATABASE', 'laravel'),
    'username' => env('DB_USERNAME', 'forge'),
    'password' => env('DB_PASSWORD', ''),
    'charset' => 'utf8',
    'prefix' => '',
    'prefix_indexes' => true,
    'search_path' => 'public',
    'sslmode' => 'prefer',
    'options' => [
        PDO::ATTR_TIMEOUT => 5, // Connection timeout
        PDO::ATTR_PERSISTENT => false, // Don't use persistent connections with PgBouncer
    ],
],
⚠️ Critical gotcha: Don't use Laravel's persistent connections (PDO::ATTR_PERSISTENT => true) when you're using PgBouncer. It creates a connection leak because Laravel will hold onto connections that PgBouncer thinks are available for reuse. We discovered this after our application slowly leaked connections over 48 hours until PgBouncer hit its limit and started rejecting new connections. Took us 6 hours of debugging to figure out the root cause.
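The leak would have been obvious much sooner if we had been watching server-side connection counts over time. Here's a minimal sketch of a scheduled check; the command name, the hard-coded limit of 50 (our max_db_connections from pgbouncer.ini), and the 80% threshold are illustrative choices, not a drop-in from our repo:

// Hypothetical artisan command: warn when server-side Postgres connections creep toward PgBouncer's limit.
use Illuminate\Console\Command;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Log;

class CheckDbConnections extends Command
{
    protected $signature = 'db:check-connections';
    protected $description = 'Warn when the PostgreSQL connection count approaches the PgBouncer pool limit';

    public function handle(): int
    {
        // pg_stat_activity counts real server-side connections, i.e. what PgBouncer holds open against Postgres
        $current = (int) DB::selectOne(
            'SELECT count(*) AS total FROM pg_stat_activity WHERE datname = current_database()'
        )->total;

        $limit = 50; // matches max_db_connections in pgbouncer.ini

        if ($current > $limit * 0.8) {
            Log::warning("Postgres connections at {$current}/{$limit}, possible connection leak");
        }

        $this->info("Current connections: {$current}/{$limit}");

        return self::SUCCESS;
    }
}

Scheduled every minute, a slow 48-hour leak shows up as a steadily climbing line on a dashboard instead of a surprise at 6:47 AM.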
After implementing PgBouncer, our database connection errors dropped from 300+ per hour to zero. Database CPU utilization during peak hours dropped from 85% to 42%. This single change bought us the headroom to scale from 5M to 15M requests per day.
Query Optimization and Indexing
Connection pooling solved our connection limit problem, but we still had slow queries. I ran a query analysis on our production database and found some horrifying statistics:
SELECT query, calls, mean_exec_time, max_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 100
ORDER BY mean_exec_time DESC
LIMIT 10;
Output:
query | calls | mean_exec_time | max_exec_time
---------------------------------------------------------------+--------+----------------+--------------
SELECT * FROM users WHERE email = $1 | 847392 | 847.32 | 4821.44
SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at... | 423847 | 523.18 | 3244.92
SELECT * FROM products WHERE category_id = $1 AND active = ... | 328471 | 412.73 | 2893.11
Our most common query—looking up users by email—was taking an average of 847ms. That's insane for a simple lookup. The problem? No index on the email column. I know, I know—how did we ship to production without an email index? In our defense, we had an index on id and assumed email lookups would be rare. We were wrong.
Here are the indexes we added that made the biggest impact:
// database/migrations/2024_03_16_add_critical_indexes.php
use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        Schema::table('users', function (Blueprint $table) {
            $table->index('email'); // Reduced lookup from 847ms to 12ms
            $table->index(['status', 'created_at']); // Composite index for filtered queries
        });

        Schema::table('orders', function (Blueprint $table) {
            $table->index('user_id'); // Reduced from 523ms to 8ms
            $table->index(['user_id', 'status', 'created_at']); // Covering index
            $table->index('created_at'); // For date range queries
        });

        Schema::table('products', function (Blueprint $table) {
            $table->index(['category_id', 'active']); // Composite for filtered queries
            $table->index('sku'); // Unique lookups
        });

        // Partial index for active products only (reduces index size by 60%)
        DB::statement('CREATE INDEX idx_products_active ON products (category_id) WHERE active = true');
    }
};
The partial index on products was particularly clever. We have 2.3M products in our database, but only 900K are active at any given time. By creating a partial index that only includes active products, we reduced the index size from 180MB to 72MB, which improved query performance and reduced memory pressure on the database server.
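Before trusting numbers like these, it's worth confirming the planner actually picks up a new index. A quick sanity check from `php artisan tinker` looks something like this (the email value is a placeholder, and `users_email_index` is simply Laravel's default name for the index created above):

// Run EXPLAIN ANALYZE through Laravel and check which plan the query uses
use Illuminate\Support\Facades\DB;

$plan = DB::select("EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'someone@example.com'");

foreach ($plan as $row) {
    // Expect "Index Scan using users_email_index on users" rather than "Seq Scan on users"
    echo $row->{'QUERY PLAN'} . PHP_EOL;
}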
After adding these indexes, our P95 query time dropped from 2.4 seconds to 180ms. Our database CPU utilization dropped another 15 percentage points. We were finally able to handle 25M requests per day without the database breaking a sweat.
Read Replicas and Query Distribution
Even with optimized queries and proper indexing, we were still seeing occasional CPU spikes on our primary database instance during traffic surges. The solution was read replicas, but implementing them correctly in Laravel requires some thought.
We set up 3 Aurora read replicas and configured Laravel to use them for read queries:
// config/database.php
'pgsql' => [
    'read' => [
        'host' => [
            env('DB_READ_HOST_1', 'aurora-replica-1.xxx.us-east-1.rds.amazonaws.com'),
            env('DB_READ_HOST_2', 'aurora-replica-2.xxx.us-east-1.rds.amazonaws.com'),
            env('DB_READ_HOST_3', 'aurora-replica-3.xxx.us-east-1.rds.amazonaws.com'),
        ],
    ],
    'write' => [
        // Assumed: writes keep using DB_HOST (the PgBouncer endpoint from the earlier config)
        'host' => [env('DB_HOST', 'pgbouncer-internal-lb-xxx.us-east-1.elb.amazonaws.com')],
    ],
    'sticky' => true, // after a write, the same request keeps reading from the writer
    // ... driver, credentials, and PDO options unchanged from the earlier pgsql config
],