Express in 2027
"The best API is the one that users don't feel is slow."
A principle that matters even more when you're streaming tokens from a language model.
In 2024, “AI-powered backend” meant calling an LLM and waiting. In 2027, it means streaming, real-time, and event-driven. Express — despite its age — is fully capable of handling all of it. Let me show you how, and then give you an honest look at whether it’s still the right tool for your stack.
Streaming Responses with Server-Sent Events (SSE)
Why SSE for AI?
event: message
data: {"content": "Hello"}

event: message
data: {"content": " world"}

event: done
data: [DONE]
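In the SSE wire format, each event is an optional `event:` line followed by one or more `data:` lines, and a blank line terminates the event. Here is a minimal sketch of a serializer for that format. The helper name `formatSseEvent` is hypothetical and not part of Express or any SDK.

```javascript
// Hypothetical helper: serialize one SSE event frame.
// Multi-line data becomes one `data:` line per line, and the frame
// is terminated by a blank line so the client knows the event is complete.
function formatSseEvent(data, eventName) {
  const payload = typeof data === 'string' ? data : JSON.stringify(data);
  const lines = [];
  if (eventName) lines.push(`event: ${eventName}`);
  for (const line of payload.split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  return lines.join('\n') + '\n\n';
}

// Usage inside a route handler (assumes `res` is an open SSE response):
// res.write(formatSseEvent({ content: 'Hello' }, 'message'));
// res.write(formatSseEvent('[DONE]', 'done'));
```

Objects are JSON-encoded, while strings such as the `[DONE]` sentinel pass through unchanged.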
Setting up SSE in Express
// routes/stream.js
app.get('/api/stream', (req, res) => {
  // Required SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.setHeader('X-Accel-Buffering', 'no'); // Disable nginx buffering
  res.flushHeaders();

  let count = 0;
  const interval = setInterval(() => {
    res.write(`data: ${JSON.stringify({ tick: count++ })}\n\n`);
    if (count >= 5) {
      res.write('data: [DONE]\n\n');
      res.end();
      clearInterval(interval);
    }
  }, 500);

  // Stop the timer if the client disconnects early
  req.on('close', () => clearInterval(interval));
});
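The same header setup recurs in every streaming route in this post, so it can be pulled into a small helper. This is a sketch only: `openSseStream` and its `send`/`close` methods are hypothetical names, and it assumes `res` behaves like a Node/Express response object.

```javascript
// Hypothetical helper: prepare a response for SSE and return two small
// functions for sending events and closing the stream.
function openSseStream(res) {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.setHeader('X-Accel-Buffering', 'no'); // Disable nginx buffering
  res.flushHeaders();

  return {
    // Serialize a payload as a single `data:` frame.
    send(payload) {
      res.write(`data: ${JSON.stringify(payload)}\n\n`);
    },
    // Emit the [DONE] sentinel used in this article, then end the response.
    close() {
      res.write('data: [DONE]\n\n');
      res.end();
    },
  };
}

// Usage in a route:
// const sse = openSseStream(res);
// sse.send({ tick: 0 });
// sse.close();
```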
Building an LLM-Powered API Endpoint
With the Anthropic SDK
// routes/ai.js
import express from 'express';
import Anthropic from '@anthropic-ai/sdk';

const router = express.Router();
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Assumes express.json() is mounted so that req.body is parsed.
router.post('/chat/stream', async (req, res) => {
  const { message } = req.body;

  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders();

  try {
    const stream = await anthropic.messages.stream({
      model: 'claude-opus-4-6',
      max_tokens: 1024,
      messages: [{ role: 'user', content: message }],
    });

    // Stop generating if the client disconnects before the response ends.
    res.on('close', () => {
      if (!res.writableEnded) stream.abort();
    });

    for await (const event of stream) {
      // Only text deltas carry a `text` field; skip other delta types.
      if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
        res.write(`data: ${JSON.stringify({ text: event.delta.text })}\n\n`);
      }
    }
    res.write('data: [DONE]\n\n');
    res.end();
  } catch (err) {
    // Headers are already sent, so report the error in-band.
    res.write(`data: ${JSON.stringify({ error: err.message })}\n\n`);
    res.end();
  }
});

export default router;
With the OpenAI SDK
// Alternative handler for routes/ai.js; reuses the `router` defined there.
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

router.post('/chat/stream', async (req, res) => {
  const { messages } = req.body;

  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders();

  try {
    const stream = await openai.chat.completions.create({
      model: 'gpt-4o',
      messages,
      stream: true,
    });

    for await (const chunk of stream) {
      const content = chunk.choices[0]?.delta?.content;
      if (content) {
        res.write(`data: ${JSON.stringify({ content })}\n\n`);
      }
    }
    res.write('data: [DONE]\n\n');
  } catch (err) {
    // Headers are already sent, so report the error in-band.
    res.write(`data: ${JSON.stringify({ error: err.message })}\n\n`);
  } finally {
    res.end();
  }
});
WebSockets Alongside REST in the Same Express Server
import express from 'express';
import http from 'http';
import { WebSocketServer } from 'ws';

const app = express();
const server = http.createServer(app);
const wss = new WebSocketServer({ server }); // Share the same HTTP server

// REST routes still work
app.get('/api/health', (req, res) => res.json({ status: 'ok' }));

// WebSocket handler
wss.on('connection', (ws) => {
  console.log('Client connected via WebSocket');

  ws.on('message', (msg) => {
    let parsed;
    try {
      parsed = JSON.parse(msg.toString());
    } catch {
      // Malformed JSON would otherwise throw and crash the process.
      ws.send(JSON.stringify({ type: 'error', payload: 'invalid JSON' }));
      return;
    }
    const { type, payload } = parsed ?? {};
    if (type === 'ping') {
      ws.send(JSON.stringify({ type: 'pong', payload }));
    }
  });

  ws.on('close', () => console.log('Client disconnected'));
});

server.listen(3000);
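Once a socket handles more than `ping`, a lookup table keyed by message type keeps the handler readable. Here is a minimal sketch of that idea for the `{ type, payload }` envelope used above. The names `handlers` and `routeMessage` and the `error` reply shape are assumptions for illustration.

```javascript
// Hypothetical dispatch table: one function per message type.
const handlers = {
  ping: (payload) => ({ type: 'pong', payload }),
};

// Parse a raw message and return the reply object to send back.
// Invalid JSON and unknown types produce an error reply instead of throwing.
function routeMessage(raw) {
  let msg;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { type: 'error', payload: 'invalid JSON' };
  }
  const type = msg?.type;
  if (typeof type !== 'string' || !Object.hasOwn(handlers, type)) {
    return { type: 'error', payload: 'unknown message type' };
  }
  return handlers[type](msg.payload);
}

// Usage inside the connection handler:
// ws.on('message', (data) => ws.send(JSON.stringify(routeMessage(data.toString()))));
```

Keeping the routing in a pure function also makes it testable without opening a socket.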
Rule of thumb
Can Express Run on Cloudflare Workers (Edge)?
Short answer: Not natively. Cloudflare Workers run on a V8 isolate runtime rather than Node.js, so Node APIs like `http`, `net`, and `process` are not available in the traditional sense. Express is built around Node's `http` request and response objects, so it cannot run there as-is.
What you can do
- Use Hono — a lightweight router with an API inspired by Express, designed for edge runtimes (Cloudflare Workers, Bun, Deno)
- Use `@cloudflare/workers-types` and write a thin adapter to forward Worker requests into an Express-like handler
- Keep Express on traditional compute (Fly.io, Railway, AWS) and put Cloudflare's CDN + WAF in front of it
Hono vs Fastify vs Express in 2027
| | Express 5 | Fastify | Hono |
|---|---|---|---|
| Ecosystem maturity | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| Raw performance | Good | Excellent (~2×) | Excellent (edge) |
| TypeScript support | Good | Excellent | Excellent |
| Learning curve | Near-zero | Low | Low |
| Edge runtime | No | No | Yes |
| Middleware ecosystem | Massive | Growing | Small but growing |
| Best for | Most APIs | High-throughput services | Edge / multi-runtime |
Should you migrate away from Express?
Migrate if:
- You need true edge/serverless deployment (Cloudflare Workers, Deno Deploy) → Hono
- You're building a high-throughput service (>50k req/s) where framework overhead is measurable → Fastify
- You're starting a greenfield project with a team comfortable with TypeScript → Fastify or Hono
Stay on Express if:
- Your team knows it well
- Your existing middleware is Express-specific
- You want zero onboarding friction for new developers
SSE is underrated: many developers jump straight to WebSockets, but SSE is easier to implement, debug, and scale when your use case is one-way, server-to-client streaming.
Frequently Asked Questions
Yes, but they're not designed for streaming. Use the native `EventSource` API for SSE, or the `fetch` streaming API (`response.body.getReader()`) for more control. Libraries like `@microsoft/fetch-event-source` give you SSE with POST support and auto-reconnect.
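For the `fetch` streaming route, the client has to reassemble SSE frames itself, because network chunks can split a frame anywhere. A minimal sketch follows, assuming the server separates events with `\n\n` as in this article and only sends `data:` fields. The generator name `readSseData` is hypothetical.

```javascript
// Hypothetical client-side reader: yields the data of each SSE event
// from a fetch() response body (a ReadableStream of bytes).
async function* readSseData(stream) {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  try {
    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });
      // Events end with a blank line; the last piece may be incomplete,
      // so keep it in the buffer until more bytes arrive.
      const frames = buffer.split('\n\n');
      buffer = frames.pop();
      for (const frame of frames) {
        const data = frame
          .split('\n')
          .filter((line) => line.startsWith('data:'))
          .map((line) => line.slice(5).replace(/^ /, ''))
          .join('\n');
        if (data) yield data;
      }
    }
  } finally {
    reader.releaseLock();
  }
}

// Usage in the browser (endpoint path taken from this article):
// const res = await fetch('/api/chat/stream', {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json' },
//   body: JSON.stringify({ message: 'Hi' }),
// });
// for await (const data of readSseData(res.body)) {
//   if (data === '[DONE]') break;
//   console.log(JSON.parse(data));
// }
```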
SSE is better for token streaming (server → client). WebSockets are better if you also need to send events mid-stream — like cancelling a generation or sending audio chunks back. Most AI chat interfaces use SSE.
Listen to `req.on('close', ...)` and clean up your streams/intervals. The browser's `EventSource` will auto-reconnect — you don't need to handle that on the server.
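One way to wire that cleanup into an LLM call is to turn the disconnect into an `AbortSignal`. The sketch below is hedged in two ways. First, `abortOnDisconnect` is a hypothetical helper. Second, it listens on `res` rather than `req`, on the assumption that on POST routes a body parser may already have consumed the request stream, so the response's `close` event is the safer disconnect signal.

```javascript
// Hypothetical helper: returns an AbortSignal that fires when the
// connection closes before the handler has called res.end().
function abortOnDisconnect(res) {
  const controller = new AbortController();
  res.on('close', () => {
    // `close` also fires after a normal end, so only abort on early exits.
    if (!res.writableEnded) controller.abort();
  });
  return controller.signal;
}

// Usage (assumption: the SDK accepts a `signal` in its per-request options;
// check this against the SDK version you use):
// const signal = abortOnDisconnect(res);
// const stream = await openai.chat.completions.create(
//   { model: 'gpt-4o', messages, stream: true },
//   { signal },
// );
```

Aborting upstream matters for cost as much as for cleanliness: without it, the model keeps generating tokens that nobody will read.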
Never expose your API key via a browser-facing endpoint. Always proxy through your Express backend. Add rate limiting (e.g., `express-rate-limit`) and authentication (JWT/API key) to your `/api/chat/stream` route.
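To show what a rate limiter does under the hood, here is a dependency-free, fixed-window sketch written as Express-style middleware. It is not the `express-rate-limit` API. The name `createRateLimiter` and its options are hypothetical, and it assumes a single process with clients keyed by `req.ip`.

```javascript
// Hypothetical in-memory fixed-window limiter (single process only).
// `now` is injectable so the window logic can be tested without waiting.
function createRateLimiter({ windowMs, max, now = Date.now }) {
  const hits = new Map(); // client key -> { count, resetAt }

  return function rateLimit(req, res, next) {
    const key = req.ip;
    const t = now();
    let entry = hits.get(key);
    // Start a fresh window on first contact or once the old one has expired.
    if (!entry || t >= entry.resetAt) {
      entry = { count: 0, resetAt: t + windowMs };
      hits.set(key, entry);
    }
    entry.count += 1;
    if (entry.count > max) {
      res.status(429).json({ error: 'Too many requests' });
      return;
    }
    next();
  };
}

// Usage: at most 20 chat requests per minute per IP.
// app.use('/api/chat/stream', createRateLimiter({ windowMs: 60_000, max: 20 }));
```

This version never evicts old entries and does not share state across instances, so for production a maintained package with a shared store is the safer choice.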
Absolutely. Just instantiate both clients and route to the appropriate one based on request params. Many production apps do this for model fallback or A/B testing.
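The fallback idea can be sketched without touching either SDK. In the code below, `providers` is a hypothetical map from a name to an async function that takes a prompt and resolves to text. In a real app each entry would wrap the corresponding SDK call, and `completeWithFallback` is likewise an illustrative name.

```javascript
// Hypothetical fallback runner: try providers in the given order and
// return the first successful result along with the provider that produced it.
async function completeWithFallback(providers, order, prompt) {
  const errors = [];
  for (const name of order) {
    const provider = providers[name];
    if (!provider) continue; // Skip names that are not registered
    try {
      return { provider: name, text: await provider(prompt) };
    } catch (err) {
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`All providers failed (${errors.join('; ')})`);
}

// Usage (the wrapper functions are assumptions, not SDK methods):
// const providers = { anthropic: askAnthropic, openai: askOpenAI };
// const order = req.body.provider === 'openai' ? ['openai', 'anthropic'] : ['anthropic', 'openai'];
// const { provider, text } = await completeWithFallback(providers, order, req.body.message);
```

This sketch works on whole responses. Falling back in the middle of a stream is harder, since the client may already have rendered partial output.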
Express 5 brought async error handling and dropped legacy callbacks. The ecosystem isn't going anywhere — it's too deeply embedded. But the interesting innovation is happening in Hono and Bun-native runtimes. The smart play is: understand Express deeply, keep an eye on Hono.