Overview
This endpoint executes a workflow and streams the response in real time as the model generates tokens. It works seamlessly with the Vercel AI SDK and is ideal for building interactive chat interfaces and real-time tutoring experiences.
Endpoint
- HTTP Method: POST
- URL: /api/v1/run/{workflow_id}/stream?token=STREAM_TOKEN
Replace {workflow_id} with your workflow ID and STREAM_TOKEN with a token from the Stream Token endpoint.
Features
Real-time Streaming
Receive and display progressive responses as the AI generates content — no waiting for the full result.
Single-Use Token Auth
Secure your requests with single-use tokens — no API key exposed on the client side.
Vercel AI SDK Compatible
Integrates directly with ai/react hooks like useChat for reactive chat interfaces.
Flexible Input Processing
Support for various input formats including text variables and image URLs.
Authentication
Uses a stream token passed as a query parameter (not a Bearer key). Get a token from GET /api/v1/token first.
Request
Path Parameters
| Name | Required | Type | Description |
|---|---|---|---|
| workflow_id | Yes | string | The ID of the workflow to run (e.g. wf_abc123) |
Query Parameters
| Name | Required | Type | Description |
|---|---|---|---|
| token | Yes | string | A single-use stream token from /api/v1/token |
Request Body (application/json)
Pass your workflow’s input variables as key/value pairs.
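For illustration, a body for a workflow that takes a single `topic` input variable (the variable name is hypothetical) would look like:

```json
{
  "topic": "photosynthesis"
}
```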
Implementation Guide
Step 1: Generate Authentication Token
First, obtain a single-use token from the token endpoint using your API key.
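The original request and response samples were not preserved here; a minimal sketch, assuming Bearer authentication and a hypothetical base URL:

```shell
API_KEY="sk_live_your_key"           # server-side secret; never expose to clients
BASE_URL="https://api.example.com"   # hypothetical host; replace with your own

# Request a single-use stream token for the streaming endpoint.
curl -s "$BASE_URL/api/v1/token" \
  -H "Authorization: Bearer $API_KEY"
```

A successful response returns JSON containing the single-use token; the exact field name (e.g. `{"token": "st_..."}`) is an assumption.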
Step 2: Make Streaming Request
Use the token to make a streaming request to the workflow endpoint. Pass --no-buffer so curl prints tokens as they arrive.
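The request itself was stripped from this section; a minimal sketch with a placeholder token, a hypothetical base URL, and a hypothetical `topic` input variable:

```shell
WORKFLOW_ID="wf_abc123"              # example workflow ID from this doc
STREAM_TOKEN="st_example"            # single-use token from Step 1 (placeholder)
BASE_URL="https://api.example.com"   # hypothetical host
URL="$BASE_URL/api/v1/run/$WORKFLOW_ID/stream?token=$STREAM_TOKEN"

# --no-buffer makes curl print each chunk as it arrives instead of buffering.
curl --no-buffer -X POST "$URL" \
  -H "Content-Type: application/json" \
  -d '{"topic": "photosynthesis"}'
```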
Step 3: Implement React Component
Use this React component to handle streaming responses in your UI:
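The component itself was not preserved. As a sketch, the streaming logic such a component needs can live in a framework-agnostic helper that decodes the text/plain body chunk by chunk; a React component can pass a callback that appends each chunk to state (all names here are illustrative):

```typescript
// Read a text/plain response stream, invoking onChunk for each decoded piece.
// Returns the full accumulated text once the stream ends.
async function readTextStream(
  stream: ReadableStream<Uint8Array>,
  onChunk: (chunk: string) => void,
): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let full = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done || value === undefined) break;
    // stream: true handles multi-byte characters split across chunks.
    const text = decoder.decode(value, { stream: true });
    full += text;
    onChunk(text);
  }
  return full;
}
```

Inside a component, wire it up from an event handler: fetch the stream URL with `method: "POST"` and a JSON body, check `res.ok` and `res.body`, then call `readTextStream(res.body, chunk => setOutput(prev => prev + chunk))`.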
Step 4: Use with Vercel AI SDK
For chat-style interfaces, integrate with the Vercel AI SDK’s useChat hook.
Response
The response is a real-time text/plain stream. Text chunks arrive as they are generated.
Error Responses
401 Unauthorized
Missing, invalid, expired, or already-used stream token.
402 Payment Required
No credits remaining on the account.
404 Not Found
Workflow not found or not published.
500 Internal Server Error
Something went wrong during execution.
Code Examples
Python
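The Python sample was stripped from this section; a standard-library sketch (the host and the `topic` input variable are assumptions):

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host; replace with your own


def stream_url(workflow_id: str, token: str) -> str:
    """Build the streaming endpoint URL for a workflow run."""
    return f"{BASE_URL}/api/v1/run/{workflow_id}/stream?token={token}"


def run_workflow_stream(workflow_id: str, token: str, inputs: dict) -> None:
    """POST the workflow inputs and print text chunks as they arrive."""
    req = urllib.request.Request(
        stream_url(workflow_id, token),
        data=json.dumps(inputs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # Read small chunks so output appears as soon as it is generated.
        while chunk := resp.read(1024):
            print(chunk.decode("utf-8", errors="replace"), end="", flush=True)


# Usage (requires a fresh single-use token):
# run_workflow_stream("wf_abc123", "st_example", {"topic": "photosynthesis"})
```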
Node.js
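The Node.js sample was stripped as well; a sketch using Node 18+’s built-in fetch (host and input variable are assumptions):

```javascript
const BASE_URL = "https://api.example.com"; // hypothetical host

// Build the streaming endpoint URL for a workflow run.
function streamUrl(workflowId, token) {
  return `${BASE_URL}/api/v1/run/${workflowId}/stream?token=${encodeURIComponent(token)}`;
}

// POST the workflow inputs and print chunks as they arrive.
async function runWorkflowStream(workflowId, token, inputs) {
  const res = await fetch(streamUrl(workflowId, token), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(inputs),
  });
  if (!res.ok) throw new Error(`Request failed: ${res.status}`);
  const decoder = new TextDecoder();
  // res.body is a web ReadableStream; in Node it is async-iterable.
  for await (const chunk of res.body) {
    process.stdout.write(decoder.decode(chunk, { stream: true }));
  }
}

// Usage (requires a fresh single-use token):
// runWorkflowStream("wf_abc123", "st_example", { topic: "photosynthesis" });
```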
cURL
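The cURL sample was also not preserved; a sketch against a hypothetical host, with the token copied by hand from the first response:

```shell
API_KEY="sk_live_your_key"           # server-side secret (placeholder)
BASE_URL="https://api.example.com"   # hypothetical host
WORKFLOW_ID="wf_abc123"

# 1. Get a single-use stream token (copy it from the JSON response).
curl -s "$BASE_URL/api/v1/token" -H "Authorization: Bearer $API_KEY"

# 2. Stream the run; --no-buffer prints chunks as they arrive.
STREAM_TOKEN="st_example"            # paste the token from step 1
curl --no-buffer -X POST \
  "$BASE_URL/api/v1/run/$WORKFLOW_ID/stream?token=$STREAM_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"topic": "photosynthesis"}'
```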
Best Practices
Error Handling
Implement robust error handling for network issues, expired tokens, and invalid responses. Always check response.ok before reading the stream.
Token Management
Tokens are single-use — generate a fresh token before each stream request. Never cache or reuse tokens.
Loading States
Display appropriate loading indicators while waiting for the initial response chunk to arrive.
Performance Optimization
Optimize your application to handle continuous data streams efficiently. Use ReadableStream readers and avoid buffering the entire response in memory.
Additional Notes
- Single-Use Token: The stream token is consumed on the first request — if it’s expired or already used, you’ll get a 401.
- Max Duration: Streaming requests can run for up to 300 seconds before timing out.
- Credits: The workflow owner’s credits are charged based on total tokens generated.
- Non-Streaming Alternative: Use the POST /api/v1/run/{workflow_id} endpoint if you don’t need real-time output.
