Real-time AI output with the Vercel AI SDK and the Claude streaming API in a Next.js App Router project
March 31, 2026

Waiting two seconds for an AI response to appear all at once feels slow. Watching it stream in word by word feels fast — even if the total time is identical. Streaming isn't just a UX improvement; it's the difference between an app that feels alive and one that feels broken.
This tutorial wires up Claude streaming in a Next.js App Router project using the Vercel AI SDK. By the end you'll have a Next.js route that streams Claude responses to the client as they're generated, plus a front end that renders tokens as they arrive with no full-page refresh, using React state and the Vercel AI SDK's useChat hook.
Install the dependencies. The AnthropicStream and StreamingTextResponse helpers used below ship with AI SDK v3 (they were removed in v4), so pin the major:

```bash
pnpm add ai@^3 @anthropic-ai/sdk
```
Set your API key in .env.local (the Anthropic client reads ANTHROPIC_API_KEY automatically, so you won't need to pass it explicitly):

```bash
ANTHROPIC_API_KEY=sk-ant-...
```
Create app/api/chat/route.ts. This is the streaming endpoint:

```ts
import Anthropic from '@anthropic-ai/sdk'
import { AnthropicStream, StreamingTextResponse } from 'ai'

const client = new Anthropic()

export async function POST(req: Request) {
  const { messages } = await req.json()

  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    stream: true,
    messages,
  })

  const stream = AnthropicStream(response)
  return new StreamingTextResponse(stream)
}
```
AnthropicStream adapts the Anthropic SDK's native stream of events into a web ReadableStream of text in the format the Vercel AI SDK's hooks expect. StreamingTextResponse wraps that stream in a proper HTTP streaming response with the right headers (Content-Type: text/plain; charset=utf-8); the runtime handles chunked delivery.
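If you ever need to consume this endpoint without the SDK's hooks (say, from a non-React client), the response body is a plain text stream you can read with the standard ReadableStream API. A minimal sketch — readTextStream is a hypothetical helper name, and the commented-out usage assumes the /api/chat route above:

```typescript
// Hypothetical helper: reads a text stream chunk by chunk, invoking
// onChunk for each decoded piece and returning the accumulated text.
async function readTextStream(
  stream: ReadableStream<Uint8Array>,
  onChunk: (text: string) => void,
): Promise<string> {
  const reader = stream.getReader()
  const decoder = new TextDecoder()
  let full = ''
  for (;;) {
    const { done, value } = await reader.read()
    if (done) break
    const text = decoder.decode(value, { stream: true })
    full += text
    onChunk(text) // render incrementally here
  }
  return full
}

// Usage in the browser (appendToUI is a stand-in for your rendering code):
// const res = await fetch('/api/chat', {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json' },
//   body: JSON.stringify({ messages: [{ role: 'user', content: 'Hi' }] }),
// })
// await readTextStream(res.body!, chunk => appendToUI(chunk))
```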
On the client side, a page component (e.g. app/page.tsx) is all you need:

```tsx
'use client'

import { useChat } from 'ai/react'

export default function ChatPage() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    api: '/api/chat',
  })

  return (
    <div className="flex flex-col h-screen max-w-2xl mx-auto p-4">
      <div className="flex-1 overflow-y-auto space-y-4 pb-4">
        {messages.map(m => (
          <div
            key={m.id}
            className={`p-3 rounded-lg ${
              m.role === 'user' ? 'bg-blue-100 ml-8' : 'bg-gray-100 mr-8'
            }`}
          >
            <p className="text-sm font-semibold capitalize mb-1">{m.role}</p>
            <p className="whitespace-pre-wrap">{m.content}</p>
          </div>
        ))}
        {isLoading && (
          <div className="bg-gray-100 mr-8 p-3 rounded-lg">
            <p className="text-sm font-semibold mb-1">assistant</p>
            <p className="text-gray-400">Thinking…</p>
          </div>
        )}
      </div>
      <form onSubmit={handleSubmit} className="flex gap-2">
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Ask something…"
          className="flex-1 border rounded-lg px-3 py-2 focus:outline-none focus:ring-2"
          disabled={isLoading}
        />
        <button
          type="submit"
          disabled={isLoading || !input.trim()}
          className="bg-blue-500 text-white px-4 py-2 rounded-lg disabled:opacity-50"
        >
          Send
        </button>
      </form>
    </div>
  )
}
```
useChat handles the entire lifecycle: sending the request, accumulating streamed tokens into messages, and toggling isLoading. You get a fully functional streaming chat interface with about 50 lines of component code.
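Conceptually, the hook folds each streamed text delta into the last assistant message and hands React a fresh array so it re-renders on every chunk. A simplified model of that accumulation (illustrative only — not the SDK's actual implementation; Msg and applyDelta are made-up names):

```typescript
type Msg = { id: string; role: 'user' | 'assistant'; content: string }

// Fold one streamed text delta into the message list: append to the
// in-progress assistant message, or start a new one on the first delta.
function applyDelta(messages: Msg[], delta: string): Msg[] {
  const last = messages[messages.length - 1]
  if (last?.role === 'assistant') {
    return [...messages.slice(0, -1), { ...last, content: last.content + delta }]
  }
  return [...messages, { id: String(messages.length), role: 'assistant', content: delta }]
}
```

Because each delta produces a new array rather than mutating the old one, React state updates fire per chunk — that's what makes the text appear token by token.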
Not every streaming use case is a chat interface. For a one-shot generation (e.g., blog post generation, code explanation), use useCompletion instead:
```tsx
'use client'

import { useCompletion } from 'ai/react'

export default function GeneratorPage() {
  const { completion, input, handleInputChange, handleSubmit, isLoading } =
    useCompletion({ api: '/api/generate' })

  return (
    <form onSubmit={handleSubmit} className="space-y-4">
      <textarea
        value={input}
        onChange={handleInputChange}
        placeholder="Describe the post you want to generate…"
        className="w-full border rounded-lg p-3 h-32"
      />
      <button type="submit" disabled={isLoading}>
        {isLoading ? 'Generating…' : 'Generate'}
      </button>
      {completion && (
        <div className="prose max-w-none mt-4 whitespace-pre-wrap">
          {completion}
        </div>
      )}
    </form>
  )
}
```
The API route for this follows the same structure, with one difference: useCompletion posts a single prompt string rather than a messages array, so the handler reads prompt from the body and wraps it in a user message itself.
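For completeness, here's a sketch of what that route could look like (app/api/generate/route.ts matches the api option used above; the system prompt and max_tokens value are illustrative choices, not requirements):

```typescript
import Anthropic from '@anthropic-ai/sdk'
import { AnthropicStream, StreamingTextResponse } from 'ai'

const client = new Anthropic()

export async function POST(req: Request) {
  // useCompletion sends { prompt }, not { messages }
  const { prompt } = await req.json()

  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 2048,
    stream: true,
    system: 'You write complete blog posts from a short description.', // example prompt
    messages: [{ role: 'user', content: prompt }],
  })

  return new StreamingTextResponse(AnthropicStream(response))
}
```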
Pass the system prompt from the API route side, not from the client. This keeps your instructions server-side and stops a malicious client from swapping in its own system prompt (it doesn't prevent prompt injection through user messages, but it closes off the most direct route):
```ts
export async function POST(req: Request) {
  const { messages } = await req.json()

  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    stream: true,
    system: 'You are a technical writing assistant. Be concise and precise.',
    messages,
  })

  const stream = AnthropicStream(response)
  return new StreamingTextResponse(stream)
}
```
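While you're hardening the route, it's also worth validating the client-supplied history before forwarding it to Claude. A sketch under stated assumptions — sanitizeMessages and MAX_TURNS are illustrative names, not part of either SDK:

```typescript
type ChatMessage = { role: 'user' | 'assistant'; content: string }

const MAX_TURNS = 20 // bound the history to cap token usage

// Drop anything that isn't a plain user/assistant turn (including any
// 'system' role a client tries to smuggle in), then keep recent turns only.
function sanitizeMessages(input: unknown): ChatMessage[] {
  if (!Array.isArray(input)) throw new Error('messages must be an array')
  const clean = input.filter(
    (m): m is ChatMessage =>
      typeof m === 'object' &&
      m !== null &&
      ((m as ChatMessage).role === 'user' || (m as ChatMessage).role === 'assistant') &&
      typeof (m as ChatMessage).content === 'string',
  )
  return clean.slice(-MAX_TURNS).map(({ role, content }) => ({ role, content }))
}
```

In the route you'd then call const messages = sanitizeMessages((await req.json()).messages) before passing the result to client.messages.create.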
To recap:

- AnthropicStream + StreamingTextResponse from the Vercel AI SDK handle the streaming plumbing; you just pass them Claude's native stream.
- useChat gives you a complete chat interface with one hook; useCompletion covers single-turn generation.