From Working to Fast
- Profiling — find where your time actually goes before optimizing blindly
- Caching — in-memory, Redis, and HTTP cache headers
- Worker threads — moving CPU-heavy work off the event loop
- Clustering with PM2 — using all your CPU cores, not just one
- Rate limiting — sliding window by user, IP, and endpoint
Profile First
The cardinal rule of performance: measure before you optimize. Optimizing the wrong thing wastes time and introduces complexity for zero gain.
Node.js Built-In Profiler
```bash
# Start your app with the profiler active
node --inspect src/index.js

# Or with tsx for TypeScript
tsx --inspect src/index.ts
```
Then open Chrome and navigate to `chrome://inspect`. Click “Inspect” on your running process to open DevTools with CPU profiling and memory snapshots.
clinic.js — The Power Tool
For deeper analysis, clinic.js gives you flame graphs, event loop analysis, and heap allocation tracking with a single command.
```bash
npm install --save-dev clinic autocannon

# Profile for 30 seconds under load
npx clinic doctor -- node dist/index.js

# Then in another terminal, hit it with load:
npx autocannon -c 100 -d 30 http://localhost:3000/api/users
```
When the run finishes, clinic doctor reports:
- Event loop delay (indicates blocking operations)
- CPU usage per function call
- Memory growth over time
What to Look For
- Event loop delay >10ms: You have blocking synchronous operations
- Memory that grows and doesn't shrink: Memory leak
- Hot functions in flame graph: Functions where CPU time concentrates — these are your optimization targets
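To put a number on that first bullet, Node's built-in `perf_hooks` can measure event-loop delay directly. A minimal sketch (the 100ms busy-wait stands in for any blocking synchronous operation):

```typescript
import { monitorEventLoopDelay } from 'perf_hooks';

export async function demoLoopDelay(): Promise<number> {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();

  // Simulate a blocking synchronous operation (~100ms busy-wait).
  await new Promise<void>((resolve) => {
    setTimeout(() => {
      const end = Date.now() + 100;
      while (Date.now() < end) { /* block the event loop */ }
      resolve();
    }, 10);
  });

  // One more timer turn so the histogram records the stall.
  await new Promise((r) => setTimeout(r, 20));
  histogram.disable();
  return histogram.max / 1e6; // histogram values are in nanoseconds
}
```

In a healthy service the max stays in single-digit milliseconds; readings consistently above ~10ms mean something synchronous is hogging the loop.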
Caching
The fastest request is the one you never process twice: cache expensive reads (database queries, computed aggregates, third-party API responses) and serve the stored result until it expires.
In-Memory Cache (Node-Cache)
```typescript
// npm install node-cache
// src/config/cache.ts
import NodeCache from 'node-cache';

export const cache = new NodeCache({
  stdTTL: 300,      // Default 5 minutes TTL
  checkperiod: 60,  // Check for expired keys every 60 seconds
  useClones: false  // Better performance for read-only data
});
```
```typescript
// Cache middleware — wraps any route with caching
import { cache } from '../config/cache.js';
import { Request, Response, NextFunction } from 'express';

export function cacheMiddleware(ttlSeconds: number) {
  return (req: Request, res: Response, next: NextFunction) => {
    const key = `${req.method}:${req.originalUrl}`;
    const cached = cache.get(key);
    if (cached) {
      return res.json(cached);
    }
    // Override res.json to capture and cache the response
    const originalJson = res.json.bind(res);
    res.json = (data: any) => {
      if (res.statusCode === 200) {
        cache.set(key, data, ttlSeconds);
      }
      return originalJson(data);
    };
    next();
  };
}

// Apply to specific routes
router.get('/products', cacheMiddleware(300), getProducts);      // Cache 5 min
router.get('/categories', cacheMiddleware(3600), getCategories); // Cache 1 hour
```
Redis Cache — Shared Across Instances
```typescript
// npm install ioredis
// src/config/redis.ts
import Redis from 'ioredis';

export const redis = new Redis(process.env.REDIS_URL!, {
  maxRetriesPerRequest: 3,
  enableReadyCheck: true,
  lazyConnect: true,
});

redis.on('error', (err) => {
  console.error('Redis error:', err);
});

// Redis cache helper
export async function getCachedOrFetch<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttlSeconds = 300
): Promise<T> {
  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached) as T;
  }
  const data = await fetcher();
  await redis.setex(key, ttlSeconds, JSON.stringify(data));
  return data;
}

// In your service
export async function getProducts() {
  return getCachedOrFetch(
    'products:all',
    () => db.product.findMany({ where: { active: true } }),
    300
  );
}
```
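The helper above is the classic cache-aside pattern. Stripped of Redis it fits in a few lines; this Map-based sketch (names are mine, not from a library) shows the same shape and is handy for unit tests:

```typescript
// Minimal in-memory version of the cache-aside pattern: check the store,
// fall back to the fetcher on a miss, save the result with a TTL.
type Entry<T> = { value: T; expiresAt: number };
const store = new Map<string, Entry<unknown>>();

export async function cachedFetch<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttlMs = 300_000
): Promise<T> {
  const hit = store.get(key);
  if (hit && hit.expiresAt > Date.now()) {
    return hit.value as T; // cache hit: skip the fetcher entirely
  }
  const value = await fetcher();
  store.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}
```

Production code should still prefer the Redis version: a Map dies with the process and is invisible to other cluster workers.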
HTTP Cache Headers — Let the CDN Do the Work
```typescript
// Cache public product list for 5 minutes, allow stale for 1 more minute
router.get('/products', (req, res, next) => {
  res.set('Cache-Control', 'public, max-age=300, stale-while-revalidate=60');
  next();
}, getProducts);

// Never cache user-specific or sensitive endpoints
router.get('/profile', requireAuth, (req, res, next) => {
  res.set('Cache-Control', 'private, no-cache');
  next();
}, getProfile);
```
Worker Threads
Node.js runs on a single thread. If you perform a CPU-intensive operation (image processing, PDF generation, complex calculations) on the main thread, every other request waits. Worker threads let you offload CPU-heavy work to a separate thread while the event loop continues serving requests.
The Problem (Event Loop Blocking)
```typescript
// This blocks the event loop for the entire calculation duration
app.get('/fibonacci/:n', (req, res) => {
  const n = parseInt(req.params.n);
  const result = fibonacci(n); // If n=45, this takes ~10 seconds — blocks everything
  res.json({ result });
});
```
The Solution (Worker Thread)
```typescript
// src/workers/fibonacci.worker.ts
import { workerData, parentPort } from 'worker_threads';

function fibonacci(n: number): number {
  if (n <= 1) return n;
  return fibonacci(n - 1) + fibonacci(n - 2);
}

const result = fibonacci(workerData.n);
parentPort?.postMessage(result); // send the result back to the main thread
```

```typescript
// src/utils/runWorker.ts
import { Worker } from 'worker_threads';
import path from 'path';

export function runWorker(workerFile: string, data: unknown): Promise<unknown> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(path.resolve(workerFile), { workerData: data });
    worker.on('message', resolve);
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

// Event loop stays free while worker runs
app.get('/fibonacci/:n', async (req, res) => {
  const result = await runWorker('./dist/workers/fibonacci.worker.js', {
    n: parseInt(req.params.n)
  });
  res.json({ result });
});
```
**Use worker threads for:** image/video processing, PDF generation, CSV parsing of large files, complex mathematical calculations, data encryption/decryption at scale.

**Don't use worker threads for:** I/O operations (database queries, HTTP requests) — those are already non-blocking via the event loop.
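For quick experiments, `worker_threads` also accepts inline source via `eval: true`, so you can try the pattern without a separate compiled worker file. A sketch (eval'd worker code runs as CommonJS, hence the `require`):

```typescript
import { Worker } from 'worker_threads';

// Worker source as a string; evaluated as CommonJS in the new thread.
const src = `
const { parentPort, workerData } = require('worker_threads');
function fib(n) { return n <= 1 ? n : fib(n - 1) + fib(n - 2); }
parentPort.postMessage(fib(workerData.n));
`;

export function fibInWorker(n: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(src, { eval: true, workerData: { n } });
    worker.on('message', resolve);
    worker.on('error', reject);
  });
}
```

For real code, prefer the separate worker file shown above: it is easier to test, type-check, and profile than a string.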
Clustering With PM2
> "Node.js operates on a single-threaded event loop, which can limit CPU utilization on multi-core systems. To leverage all available CPU cores, use the cluster module or process managers like PM2."
>
> — GitHub Community, Node.js Performance Discussion
PM2 Ecosystem Config
```javascript
// npm install --save pm2
// ecosystem.config.cjs
module.exports = {
  apps: [{
    name: 'my-express-api',
    script: './dist/index.js',
    instances: 'max',           // One process per CPU core
    exec_mode: 'cluster',       // Enable cluster mode
    watch: false,               // Don't watch files in production
    max_memory_restart: '500M', // Restart if memory exceeds 500MB
    env_production: {
      NODE_ENV: 'production',
      PORT: 3000
    }
  }]
};
```
Deploy
```bash
# Build TypeScript first
npm run build

# Start in cluster mode
pm2 start ecosystem.config.cjs --env production

# Check status
pm2 status

# Monitor in real-time
pm2 monit

# Scale up/down without restart
pm2 scale my-express-api 4

# Enable auto-restart on server reboot
pm2 startup
pm2 save
```
Important: PM2 cluster mode runs separate processes, each with its own memory. Anything shared between requests (sessions, caches, rate-limit counters) must live in an external store such as Redis, or each worker will see different state.
Manual Clustering
```typescript
// src/cluster.ts
import cluster from 'cluster';
import os from 'os';
import { logger } from './config/logger.js';

const numCPUs = os.cpus().length;

if (cluster.isPrimary) {
  logger.info(`Primary process ${process.pid} starting ${numCPUs} workers`);
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker, code, signal) => {
    logger.warn(`Worker ${worker.process.pid} died (${signal || code}). Restarting...`);
    cluster.fork(); // Always restart a dead worker
  });
} else {
  // Worker: run the actual Express app
  import('./index.js');
  logger.info(`Worker ${process.pid} started`);
}
```
Rate Limiting
```typescript
// npm install express-rate-limit rate-limit-redis ioredis
// src/middleware/rateLimiter.ts
import rateLimit from 'express-rate-limit';
import { RedisStore } from 'rate-limit-redis';
import { redis } from '../config/redis.js';
import { Request } from 'express';

// General API rate limiter
export const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 200,
  standardHeaders: 'draft-7', // Return RateLimit headers
  legacyHeaders: false,
  store: new RedisStore({ sendCommand: (...args: string[]) => redis.call(...args) }),
  keyGenerator: (req: Request) => {
    // Use user ID if authenticated, otherwise fall back to IP
    return (req as any).user?.userId ?? req.ip ?? 'anonymous';
  },
  message: {
    type: 'https://api.yourapp.com/errors/rate-limited',
    title: 'Too Many Requests',
    status: 429,
    detail: 'You have exceeded the rate limit. Please try again later.',
  }
});

// Strict limiter for auth endpoints (prevent brute-force)
export const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 10, // Only 10 attempts per 15 minutes
  store: new RedisStore({ sendCommand: (...args: string[]) => redis.call(...args) }),
  keyGenerator: (req: Request) => `auth:${req.ip}`,
  message: {
    type: 'https://api.yourapp.com/errors/rate-limited',
    title: 'Too Many Login Attempts',
    status: 429,
    detail: 'Too many login attempts. Please wait 15 minutes.',
  }
});

// Per-endpoint limiter for expensive operations
export const exportLimiter = rateLimit({
  windowMs: 60 * 60 * 1000, // 1 hour
  max: 5, // Only 5 exports per hour
  store: new RedisStore({ sendCommand: (...args: string[]) => redis.call(...args) }),
  keyGenerator: (req: Request) => `export:${(req as any).user?.userId}`,
});
```
```typescript
// Apply in your routes:
import { apiLimiter, authLimiter, exportLimiter } from '../middleware/rateLimiter.js';

// Global API limit
app.use('/api', apiLimiter);

// Strict limit on auth routes
app.use('/api/auth', authLimiter);

// Expensive endpoint-specific limit
router.post('/reports/export', requireAuth, exportLimiter, generateExport);
```
Production Performance Checklist
- [ ] Profile first — use `clinic.js` to identify actual bottlenecks
- [ ] Cache hot data — Redis for shared state, in-memory for single-instance
- [ ] Add HTTP cache headers for public endpoints
- [ ] Offload CPU work to worker threads (image processing, PDF generation)
- [ ] Enable PM2 clustering with `instances: 'max'`
- [ ] Enable compression (`npm install compression` — gzip typically reduces text response size by 70-80%)
- [ ] Rate limit by user, IP, and endpoint with Redis-backed sliding window
- [ ] Use connection pooling for all database connections
- [ ] Never block the event loop with synchronous operations
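The compression item is easy to sanity-check with Node's built-in `zlib`: repetitive JSON, the typical API payload, compresses dramatically. A sketch:

```typescript
import { gzipSync } from 'zlib';

// A typical list-style API payload: 200 similar objects.
const payload = JSON.stringify(
  Array.from({ length: 200 }, (_, i) => ({ id: i, name: `Product ${i}`, active: true }))
);

export const originalBytes = Buffer.byteLength(payload); // size over the wire, uncompressed
export const gzippedBytes = gzipSync(payload).length;    // size with gzip applied

console.log(`original: ${originalBytes} B, gzipped: ${gzippedBytes} B`);
```

Exact ratios depend on the payload, which is why the checklist gives a range rather than a single number.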
Premature optimization is the root of all evil. But failure to optimize at all is the root of a different kind of evil — unavailability.
Thank You for Spending Your Valuable Time
I truly appreciate you taking the time to read this post, and I hope you found the content insightful and engaging!
Frequently Asked Questions
**Can I use Redis with PM2 cluster mode?**
Yes — in fact, Redis is *required* for shared state in a clustered environment. Each PM2 worker has its own memory, so sessions and cache must live in Redis to be accessible across workers.
**When should I use worker threads versus child processes?**
Use worker threads for CPU-intensive computation within the same Node.js process — they share memory and are lighter weight. Use child processes (`child_process.spawn`) for running external programs or scripts outside of Node.js.
**What's the difference between fixed-window and sliding-window rate limiting?**
A fixed window resets at exact intervals (e.g., every 15 minutes). This allows bursting at the window boundary — a user could send 200 requests at 14:59 and another 200 at 15:01. A sliding window tracks the last N requests regardless of when the window started, eliminating this burst problem.
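The difference is easy to see in a toy sliding-window counter (a sketch for illustration, not the express-rate-limit internals):

```typescript
// Toy sliding-window limiter: allow at most `limit` requests in any
// `windowMs`-long interval, measured backwards from each request.
export function makeSlidingLimiter(limit: number, windowMs: number) {
  const hits: number[] = []; // timestamps of accepted requests
  return function allow(now: number): boolean {
    // Drop timestamps that have fallen out of the window.
    while (hits.length && hits[0] <= now - windowMs) hits.shift();
    if (hits.length >= limit) return false;
    hits.push(now);
    return true;
  };
}
```

This is the timestamp-log approach; production implementations usually approximate it (for example with Redis sorted sets) to keep memory bounded.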
**How much memory does each PM2 worker use?**
Each worker is a separate Node.js process with its own V8 heap. A typical Express API uses 50-200MB per worker. Set `max_memory_restart` in your PM2 config to automatically restart workers that exceed your limit and prevent memory leak accumulation.
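To see where your workers fall in that range, each process can report its own footprint via `process.memoryUsage()` (values are in bytes):

```typescript
// Snapshot of this process's memory footprint, converted to megabytes.
const { rss, heapUsed, heapTotal } = process.memoryUsage();

export const memoryMB = {
  rss: +(rss / 1024 / 1024).toFixed(1),           // total resident set size
  heapUsed: +(heapUsed / 1024 / 1024).toFixed(1), // live JS objects
  heapTotal: +(heapTotal / 1024 / 1024).toFixed(1),
};

console.log(memoryMB);
```

Logging this periodically (or exposing it on a health endpoint) makes it easy to pick a sensible `max_memory_restart` value.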
**Does the in-memory cache work across PM2 workers?**
No — each PM2 worker has its own in-memory cache instance. A cache set in worker A is invisible to worker B. For clustered apps, always use Redis for shared caching.