
Automation Scripts for Deploying a Full-Stack Application: What We Learned Scaling to Production

Learn how we automated deployment for a full-stack app handling 2M+ requests/day. Real scripts, actual failures, and the gotchas that cost us 3 days of debugging.

Performance · 26 min read
NextGenBeing · May 5, 2026
Photo by Đào Hiếu on Unsplash


Last October, our team hit a wall. We were deploying our full-stack application manually—SSH into the server, pull the latest code, restart services, pray nothing breaks. It worked fine when we had 50k users. But when we crossed 500k active users and started pushing updates three times a day, the manual process became a nightmare.

One Friday afternoon deployment took down our API for 18 minutes. Our frontend was serving the new version, but the backend was still running old code. Database migrations ran halfway before timing out. Our monitoring went haywire. We lost about $12k in revenue during those 18 minutes, and our support team fielded 300+ angry tickets.

That weekend, I rebuilt our entire deployment pipeline from scratch. Not because I wanted to—because we had no choice. What I learned during those two sleepless nights fundamentally changed how I think about deployment automation. This isn't a theoretical guide. This is what actually works when you're deploying a React frontend, Node.js backend, PostgreSQL database, and Redis cache to production multiple times per day.

I'm going to share the exact scripts we use, the failures that taught us hard lessons, and the architectural decisions that saved us during our next major scaling event. If you're still deploying manually or your CI/CD pipeline feels fragile, this is the deep dive you need.

The Problem Nobody Talks About: Deployment Isn't Just "Push to Production"

When I first started researching deployment automation, every tutorial made it look simple. "Just use Docker!" they said. "Set up a CI/CD pipeline!" they said. What they didn't mention is that deployment automation for a real full-stack application involves coordinating:

  • Frontend builds that take 4-8 minutes and generate 50MB+ of static assets
  • Backend deployments that need zero-downtime with rolling updates
  • Database migrations that must run exactly once, in order, without locking tables (see the sketch just after this list)
  • Cache invalidation across multiple Redis instances
  • Health checks that actually verify the system works, not just that processes are running
  • Rollback mechanisms when something inevitably goes wrong at 2 AM
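
On that migrations bullet: one way to get "exactly once, in order" when two deploys might race is to serialize the migration step behind a PostgreSQL advisory lock; your migration tool's own ledger table then makes any re-run a no-op. This is an illustrative sketch, not our production script: the lock key is an arbitrary constant, and npm run migrate stands in for whatever your migration command is.

#!/bin/bash
# Illustrative: serialize migrations across racing deploys with a Postgres
# advisory lock. The lock is session-scoped, so the migration command runs
# via psql's \! shell escape while the same session (and lock) stays alive.
set -eu

psql "$DATABASE_URL" -v ON_ERROR_STOP=1 <<'SQL'
SELECT pg_advisory_lock(727272);  -- blocks until no other deploy holds it
\! npm run migrate
SELECT pg_advisory_unlock(727272);  -- also auto-released when the session ends
SQL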

Our first attempt at automation handled the happy path beautifully. But production isn't the happy path. Production is when your database migration fails halfway through because you hit the connection limit. Production is when your Docker image builds successfully but crashes on startup because an environment variable is missing. Production is when your health check passes but your WebSocket connections are silently failing.
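
One cheap guard against the missing-environment-variable failure mode is a preflight check at the very top of the deploy, so you fail before anything ships instead of after. A minimal sketch; the variable names are placeholders for whatever your app actually requires:

#!/bin/bash
# Illustrative preflight: abort the deploy early if required env vars are
# unset or empty. The variable names below are placeholders.
required=(DATABASE_URL REDIS_URL JWT_SECRET VITE_API_URL)
missing=0
for var in "${required[@]}"; do
  if [ -z "${!var:-}" ]; then
    echo "Missing required env var: $var" >&2
    missing=1
  fi
done
exit "$missing"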

Let me walk you through what we built, starting with the foundation and moving up to the orchestration layer that ties everything together.

Architecture Overview: What We're Actually Deploying

Before diving into scripts, here's our stack. Your stack might differ, but the principles apply universally:

Frontend:

  • React 18.x with TypeScript
  • Vite for building (replaced Webpack—build time dropped from 8 minutes to 90 seconds)
  • Deployed to Nginx serving static files
  • CloudFront CDN for global distribution

Backend:

  • Node.js 20.x with Express
  • TypeScript compiled to JavaScript
  • PM2 for process management (4 instances behind Nginx load balancer)
  • PostgreSQL 15 for primary database
  • Redis 7.x for caching and session storage

Infrastructure:

  • AWS EC2 instances (t3.large for backend, t3.medium for frontend)
  • RDS for PostgreSQL (db.r6g.xlarge—we learned the hard way that disk I/O matters)
  • ElastiCache for Redis (cache.r6g.large with cluster mode)
  • GitHub Actions for CI/CD
  • Docker for containerization (but we don't use Kubernetes—more on that later)

We deploy to three environments: development, staging, and production. Each environment is isolated with its own database, Redis instance, and EC2 instances. The deployment process must work identically across all three, which is harder than it sounds.
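
One pattern that helps keep the process identical across environments is a single entry point that takes the environment name as its only argument and sources per-environment settings, so the deploy logic itself never forks. A minimal sketch, assuming per-environment files like config/staging.env (the layout and names are illustrative):

#!/bin/bash
# Illustrative entry point: one deploy script, parameterized by environment.
# The config/<env>.env layout is an assumption.
set -eu

ENVIRONMENT="${1:?usage: $0 <development|staging|production>}"
ENV_FILE="config/${ENVIRONMENT}.env"

[ -f "$ENV_FILE" ] || { echo "No config file at $ENV_FILE" >&2; exit 1; }

set -a          # export everything the env file defines
. "$ENV_FILE"
set +a

echo "Deploying to ${ENVIRONMENT}..."
# ...same build/migrate/release steps for every environment from here on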

The Foundation: Docker Images That Actually Work in Production

Our first major mistake was treating Docker as just a packaging tool. We'd build images locally, push them to Docker Hub, and deploy. It worked—until it didn't. The images were bloated (1.2GB for a Node.js app), builds were slow (12+ minutes), and we had zero layer caching.

Here's the Dockerfile we ended up with after months of iteration. I'll explain each decision because the "why" matters more than the "what":

# Backend Dockerfile
FROM node:20-alpine AS builder

# Install build dependencies
RUN apk add --no-cache python3 make g++

WORKDIR /app

# Copy package files first (layer caching)
COPY package*.json ./
COPY tsconfig.json ./

# Install dependencies with frozen lockfile
# (dev dependencies included, because the TypeScript build runs in this stage)
RUN npm ci

# Copy source code
COPY src/ ./src/

# Build TypeScript
RUN npm run build

# Drop dev dependencies so the production stage copies only runtime deps
RUN npm prune --omit=dev && \
    npm cache clean --force

# Production stage
FROM node:20-alpine

# Add non-root user for security
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001

WORKDIR /app

# Copy built artifacts and dependencies
COPY --from=builder --chown=nodejs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nodejs:nodejs /app/dist ./dist
COPY --from=builder --chown=nodejs:nodejs /app/package*.json ./

# Switch to non-root user
USER nodejs

# Expose port
EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s --retries=3 \
    CMD node -e "require('http').get('http://localhost:3000/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1))"

# Start application
CMD ["node", "dist/server.js"]

Why multi-stage builds? Our final image is 180MB instead of 1.2GB. The builder stage includes all the development dependencies and build tools, but the production stage only gets the compiled code and runtime dependencies. This reduced our image pull time from 3 minutes to 25 seconds.
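
If you want to verify numbers like that yourself, docker image ls reports the final size (the image name here is a placeholder):

# Check the final image size after a multi-stage build (placeholder name)
docker image ls backend --format '{{.Repository}}:{{.Tag}}  {{.Size}}'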

Why npm ci instead of npm install? We learned this after a staging deployment mysteriously broke. Someone had updated a dependency locally without updating package-lock.json. npm ci uses the exact versions in the lockfile and fails if they're out of sync. It's also 2-3x faster.

Why a non-root user? Security best practice, but also required by our compliance team after a security audit. Running as root in containers is asking for trouble if someone finds an exploit.

Why that specific health check? Without one, Docker only knows whether the process is running, and a naive check just verifies the port answers. Ours actually makes an HTTP request to our /health endpoint, which verifies database connectivity, Redis connectivity, and checks that critical services are initialized. We discovered the importance of this when a container passed health checks but couldn't connect to the database due to a networking issue.

Here's the health check endpoint implementation:

// src/routes/health.ts
import { Request, Response } from 'express';
import { pool } from '../db/connection';
import { redisClient } from '../cache/redis';

export async function healthCheck(req: Request, res: Response) {
    const checks = {
        database: false,
        redis: false,
        uptime: process.uptime(),
        timestamp: new Date().toISOString()
    };

    try {
        // Check database with timeout
        await Promise.race([
            pool.query('SELECT 1'),
            new Promise((_, reject) => 
                setTimeout(() => reject(new Error('Database timeout')), 2000)
            )
        ]);
        checks.database = true;
    } catch (error) {
        console.error('Database health check failed:', error);
    }

    try {
        // Check Redis with timeout
        const redisResult = await Promise.race([
            redisClient.ping(),
            new Promise((_, reject) => 
                setTimeout(() => reject(new Error('Redis timeout')), 2000)
            )
        ]);
        checks.redis = redisResult === 'PONG';
    } catch (error) {
        console.error('Redis health check failed:', error);
    }

    // Return 200 only if all checks pass
    if (checks.database && checks.redis) {
        res.status(200).json({ status: 'healthy', checks });
    } else {
        res.status(503).json({ status: 'unhealthy', checks });
    }
}
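
A quick way to exercise this from the host, making the same request the Docker HEALTHCHECK makes:

# 200 means both the database and Redis checks passed; 503 otherwise
curl -si http://localhost:3000/health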

The frontend Dockerfile is simpler but has its own gotchas:

# Frontend Dockerfile
FROM node:20-alpine AS builder

WORKDIR /app

COPY package*.json ./
RUN npm ci

COPY . .

# Build with production optimizations
ARG VITE_API_URL
ARG VITE_WS_URL
ENV VITE_API_URL=$VITE_API_URL
ENV VITE_WS_URL=$VITE_WS_URL

RUN npm run build

# Production stage with Nginx
FROM nginx:alpine

# Copy custom nginx config
COPY nginx.conf /etc/nginx/nginx.conf

# Copy built assets
COPY --from=builder /app/dist /usr/share/nginx/html

# Add health check script (nginx:alpine ships BusyBox wget, not curl)
RUN echo 'wget -q --spider http://localhost/health || exit 1' > /health.sh && \
    chmod +x /health.sh

HEALTHCHECK --interval=30s --timeout=3s CMD /health.sh

EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]

Why build args for API URLs? We used to hardcode these in the frontend code, which meant rebuilding for each environment. Now we pass them at build time. The Docker image is environment-agnostic until we inject the URLs.
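
Concretely, a staging build might look like this; VITE_API_URL and VITE_WS_URL match the ARGs in the Dockerfile above, and the URLs are placeholders:

# Bake environment-specific endpoints in at build time (placeholder URLs)
docker build \
  --build-arg VITE_API_URL="https://api.staging.example.com" \
  --build-arg VITE_WS_URL="wss://ws.staging.example.com" \
  -t frontend:staging .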

Why a custom nginx.conf? The default Nginx configuration doesn't handle React Router properly. Any route that isn't / returns a 404. Here's our config:

# nginx.conf
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    client_max_body_size 20M;

    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_types text/plain text/css text/xml text/javascript 
               application/x-javascript application/xml+rss 
               application/javascript application/json;

    server {
        listen 80;
        server_name _;
        root /usr/share/nginx/html;
        index index.html;

        # Security headers
        add_header X-Frame-Options "SAMEORIGIN" always;
        add_header X-Content-Type-Options "nosniff" always;
        add_header X-XSS-Protection "1; mode=block" always;

        # Cache static assets
        location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff|woff2|ttf|eot)$ {
            expires 1y;
            add_header Cache-Control "public, immutable";
        }

        # Handle React Router
        location / {
            try_files $uri $uri/ /index.html;
        }

        # Health check endpoint
        location /health {
            access_log off;
            return 200 "healthy\n";
            add_header Content-Type text/plain;
        }
    }
}

The try_files $uri $uri/ /index.html; line is critical. Without it, refreshing on /dashboard returns a 404 because Nginx looks for a file called dashboard that doesn't exist. This directive tells Nginx to serve index.html for any route, letting React Router handle the routing.
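
An easy post-deploy check is to request a client-side route directly and confirm index.html comes back instead of a 404. Assuming the container is published on port 8080:

# Should print 200 (index.html served for a client-side route), not 404
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/dashboard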

The Build Script: Making Docker Builds Fast and Reliable

Building Docker images in CI/CD is where things get slow if you're not careful. Our initial builds took 12-15 minutes. After optimization, we're down to 3-4 minutes. Here's the script:

#!/bin/bash
# scripts/build.sh

set -e  # Exit on any error
set -u  # Exit on undefined variable

# Configuration
DOCKER_REGISTRY="your-registry.azurecr.
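
Much of a drop like 12 minutes down to 4 typically comes from reusing layer caches across CI runs rather than rebuilding every layer from scratch. A minimal sketch of that idea, assuming BuildKit via docker buildx and a registry that can store cache manifests; the registry, image name, and tag are placeholders:

#!/bin/bash
# Illustrative only: reuse Docker layer caches across CI runs with BuildKit.
# Registry, image name, and tag are placeholders.
set -eu

IMAGE="registry.example.com/backend"
TAG="${GITHUB_SHA:-dev}"

docker buildx build \
  --cache-from "type=registry,ref=${IMAGE}:buildcache" \
  --cache-to "type=registry,ref=${IMAGE}:buildcache,mode=max" \
  --tag "${IMAGE}:${TAG}" \
  --push \
  .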
