Streaming Responses

Shadai streams all responses in real time, delivering tokens as they are generated so users see output as soon as it is available.

Why Streaming?

Without Streaming:

User asks question → Wait 30 seconds → Get complete answer

With Streaming:

User asks question → Immediate first token → Tokens flow → Natural experience

Benefits:

⚡ Instant feedback - Users see responses immediately

🎯 Better UX - No long waits

🔄 Interruptible - Can stop if not useful

📊 Progress indication - Users know something is happening

Basic Streaming

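All Shadai tools support streaming. Each one yields its response in chunks that you consume with an async for loop and print as they arrive.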

Key points:

async for - Iterate over chunks

chunk - An individual piece of text (it could be a token, a word, or a sentence)

end="" - No newline between chunks

flush=True - Display immediately
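
The key points above can be sketched as follows. This is a minimal illustration, not Shadai's actual API: `demo_stream` is a hypothetical stand-in for whichever Shadai call returns the stream, used so the loop can run on its own.

```python
import asyncio
from typing import AsyncIterator


async def demo_stream(text: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call: yields one word per chunk."""
    for word in text.split(" "):
        await asyncio.sleep(0)  # hand control back to the event loop, as a network read would
        yield word + " "


async def main() -> None:
    # async for: iterate over chunks as they arrive
    async for chunk in demo_stream("Streaming shows text as it arrives"):
        # end="" avoids a newline between chunks; flush=True displays each one immediately
        print(chunk, end="", flush=True)
    print()  # finish the line once the stream ends


if __name__ == "__main__":
    asyncio.run(main())
```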

Streaming All Tools

Query

Summarize

Web Search

Engine

Agent

Collecting Full Response

Method 1: Accumulate Chunks
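
A minimal sketch of this method, assuming the stream is any async iterator of text chunks. `demo_stream` is a hypothetical stand-in for a Shadai streaming call, not part of the library.

```python
import asyncio
from typing import AsyncIterator


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["Streaming ", "keeps ", "users ", "informed."]:
        await asyncio.sleep(0)
        yield chunk


async def collect_by_concatenation(stream: AsyncIterator[str]) -> str:
    full_response = ""
    async for chunk in stream:
        print(chunk, end="", flush=True)  # still display each chunk as it arrives
        full_response += chunk            # and keep a running copy of the text
    print()
    return full_response


if __name__ == "__main__":
    asyncio.run(collect_by_concatenation(demo_stream()))
```

String concatenation is the simplest option and is fine for responses of ordinary length.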

Method 2: List Join
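
The same result can be reached by appending chunks to a list and joining once at the end, which avoids rebuilding the string on every chunk. The sketch below again uses a hypothetical `demo_stream` in place of a real Shadai call.

```python
import asyncio
from typing import AsyncIterator, List


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["Lists ", "join ", "cheaply."]:
        await asyncio.sleep(0)
        yield chunk


async def collect_by_join(stream: AsyncIterator[str]) -> str:
    chunks: List[str] = []
    async for chunk in stream:
        chunks.append(chunk)   # appending to a list is cheap
    return "".join(chunks)     # build the final string once


if __name__ == "__main__":
    print(asyncio.run(collect_by_join(demo_stream())))
```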

Practical Patterns

Pattern 1: Terminal Output
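
One way to package terminal output is a small helper that writes each chunk to a text stream and returns the full text. The helper name `print_stream` and the stand-in `demo_stream` are illustrative assumptions, not Shadai functions.

```python
import asyncio
import sys
from typing import AsyncIterator, List, Optional, TextIO


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["Hello, ", "terminal!"]:
        await asyncio.sleep(0)
        yield chunk


async def print_stream(stream: AsyncIterator[str], out: Optional[TextIO] = None) -> str:
    out = out if out is not None else sys.stdout
    parts: List[str] = []
    async for chunk in stream:
        out.write(chunk)
        out.flush()        # same effect as print(..., flush=True)
        parts.append(chunk)
    out.write("\n")        # end the line once the stream is finished
    out.flush()
    return "".join(parts)


if __name__ == "__main__":
    asyncio.run(print_stream(demo_stream()))
```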

Pattern 2: File Writing
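
A sketch of writing a stream to disk as it arrives, assuming the stream yields text chunks. `stream_to_file` and `demo_stream` are hypothetical names. The file writes here are ordinary blocking calls, which is usually acceptable for small chunks.

```python
import asyncio
from pathlib import Path
from typing import AsyncIterator


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["line one\n", "line two\n"]:
        await asyncio.sleep(0)
        yield chunk


async def stream_to_file(stream: AsyncIterator[str], path: Path) -> int:
    written = 0
    with path.open("w", encoding="utf-8") as f:
        async for chunk in stream:
            f.write(chunk)
            f.flush()  # push each chunk out so anyone tailing the file sees progress
            written += len(chunk)
    return written  # number of characters written


if __name__ == "__main__":
    asyncio.run(stream_to_file(demo_stream(), Path("response.txt")))
```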

Pattern 3: Web Server (FastAPI)
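
One possible shape for the server side is an async generator that reformats chunks as Server-Sent Events. In FastAPI, a generator like this can be passed to `StreamingResponse` with `media_type="text/event-stream"`. The sketch below covers only the formatting step so it runs without a web framework. The `done` event name and `demo_stream` are assumptions made for illustration.

```python
import asyncio
import json
from typing import AsyncIterator


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["Hello ", "world"]:
        await asyncio.sleep(0)
        yield chunk


async def sse_events(stream: AsyncIterator[str]) -> AsyncIterator[str]:
    async for chunk in stream:
        # JSON-encode the chunk so embedded newlines cannot break the SSE framing
        yield f"data: {json.dumps(chunk)}\n\n"
    # Illustrative end-of-stream marker; the event name is an assumption
    yield "event: done\ndata: {}\n\n"


async def main() -> None:
    async for event in sse_events(demo_stream()):
        print(event, end="")


if __name__ == "__main__":
    asyncio.run(main())
```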

Pattern 4: WebSocket
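
For WebSockets, the core loop is forwarding each chunk to the socket's send method. The sketch below takes that method as a plain awaitable callable (for example, `websocket.send_text` in FastAPI or Starlette), so it stays independent of any framework. The `[DONE]` sentinel and `demo_stream` are illustrative assumptions.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["ping ", "pong"]:
        await asyncio.sleep(0)
        yield chunk


async def forward_stream(
    stream: AsyncIterator[str],
    send: Callable[[str], Awaitable[None]],
) -> int:
    sent = 0
    async for chunk in stream:
        await send(chunk)   # one WebSocket message per chunk
        sent += 1
    await send("[DONE]")    # assumed sentinel so the client knows the stream ended
    return sent


async def main() -> List[str]:
    received: List[str] = []

    async def fake_send(message: str) -> None:  # stands in for websocket.send_text
        received.append(message)

    await forward_stream(demo_stream(), fake_send)
    return received


if __name__ == "__main__":
    print(asyncio.run(main()))
```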

Pattern 5: Progress Indication
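
A simple form of progress indication is to show a waiting message until the first chunk arrives, then replace it with the streamed text. This is a sketch under that assumption. The message text, the chunk counter, and `demo_stream` are all illustrative.

```python
import asyncio
import sys
from typing import AsyncIterator, Optional, TextIO


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["a", "b", "c"]:
        await asyncio.sleep(0)
        yield chunk


async def stream_with_progress(stream: AsyncIterator[str], out: Optional[TextIO] = None) -> int:
    out = out if out is not None else sys.stdout
    message = "Waiting for first token..."
    out.write(message)
    out.flush()
    count = 0
    async for chunk in stream:
        if count == 0:
            # Erase the waiting message: return to line start, overwrite with spaces
            out.write("\r" + " " * len(message) + "\r")
        out.write(chunk)
        out.flush()
        count += 1
    out.write(f"\n[{count} chunks]\n")
    out.flush()
    return count


if __name__ == "__main__":
    asyncio.run(stream_with_progress(demo_stream()))
```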

Error Handling

Basic Error Handling
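
At its simplest, error handling means wrapping the whole loop in try/except. The exception types Shadai raises are not listed here, so this sketch catches `Exception` broadly. `failing_stream` is a hypothetical stand-in that simulates a network failure.

```python
import asyncio
from typing import AsyncIterator, List, Optional


async def failing_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming call that breaks partway through."""
    yield "partial "
    raise ConnectionError("simulated network failure")


async def stream_safely(stream: AsyncIterator[str]) -> Optional[str]:
    parts: List[str] = []
    try:
        async for chunk in stream:
            print(chunk, end="", flush=True)
            parts.append(chunk)
    except Exception as exc:  # narrow this once the real exception types are known
        print(f"\nStreaming failed: {exc}")
        return None
    print()
    return "".join(parts)


if __name__ == "__main__":
    asyncio.run(stream_safely(failing_stream()))
```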

Mid-Stream Errors
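
When a stream fails partway through, the text received so far is often still worth keeping. A minimal sketch, assuming the failure surfaces as an exception raised from the async iterator. `failing_stream` is a hypothetical stand-in.

```python
import asyncio
from typing import AsyncIterator, List, Optional, Tuple


async def failing_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming call that breaks partway through."""
    yield "The answer "
    yield "is "
    raise ConnectionError("simulated disconnect")


async def stream_keep_partial(
    stream: AsyncIterator[str],
) -> Tuple[str, Optional[Exception]]:
    parts: List[str] = []
    error: Optional[Exception] = None
    try:
        async for chunk in stream:
            parts.append(chunk)
    except Exception as exc:
        error = exc  # remember the failure but keep what already arrived
    return "".join(parts), error


if __name__ == "__main__":
    text, err = asyncio.run(stream_keep_partial(failing_stream()))
    print(f"received {text!r}, error: {err}")
```

The caller can then decide whether to show the partial text, retry, or both.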

Advanced Techniques

Chunk Processing
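
Chunk processing can be expressed as an async generator that wraps the original stream and transforms each chunk before passing it on. `process_chunks` and `demo_stream` are illustrative names, not Shadai functions.

```python
import asyncio
from typing import AsyncIterator, Callable


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["hello ", "", "world"]:
        await asyncio.sleep(0)
        yield chunk


async def process_chunks(
    stream: AsyncIterator[str],
    transform: Callable[[str], str],
) -> AsyncIterator[str]:
    async for chunk in stream:
        processed = transform(chunk)
        if processed:  # drop chunks that are empty after processing
            yield processed


async def main() -> None:
    async for chunk in process_chunks(demo_stream(), str.upper):
        print(chunk, end="", flush=True)
    print()


if __name__ == "__main__":
    asyncio.run(main())
```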

Rate Limiting
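
If chunks arrive faster than a client should receive them, a wrapper can enforce a minimum interval between chunks. This sketch assumes a fixed delay is acceptable. `throttle` and `demo_stream` are hypothetical names.

```python
import asyncio
from typing import AsyncIterator, Optional


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["one ", "two ", "three"]:
        await asyncio.sleep(0)
        yield chunk


async def throttle(stream: AsyncIterator[str], min_interval: float) -> AsyncIterator[str]:
    loop = asyncio.get_running_loop()
    last_emit: Optional[float] = None
    async for chunk in stream:
        if last_emit is not None:
            remaining = min_interval - (loop.time() - last_emit)
            if remaining > 0:
                await asyncio.sleep(remaining)  # wait out the rest of the interval
        last_emit = loop.time()
        yield chunk


async def main() -> None:
    async for chunk in throttle(demo_stream(), min_interval=0.05):
        print(chunk, end="", flush=True)
    print()


if __name__ == "__main__":
    asyncio.run(main())
```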

Buffered Streaming
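
Buffering groups small chunks into larger ones before they are displayed or sent, which reduces the number of writes. A minimal sketch, assuming a character-count threshold is a suitable trigger. `buffer_chunks` and `demo_stream` are illustrative names.

```python
import asyncio
from typing import AsyncIterator


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["ab", "cd", "ef", "g"]:
        await asyncio.sleep(0)
        yield chunk


async def buffer_chunks(stream: AsyncIterator[str], min_chars: int = 20) -> AsyncIterator[str]:
    buffer = ""
    async for chunk in stream:
        buffer += chunk
        if len(buffer) >= min_chars:
            yield buffer  # emit once enough text has accumulated
            buffer = ""
    if buffer:
        yield buffer      # flush whatever is left when the stream ends


async def main() -> None:
    async for block in buffer_chunks(demo_stream(), min_chars=4):
        print(repr(block))


if __name__ == "__main__":
    asyncio.run(main())
```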

Multi-Stream Aggregation
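
Several streams can be consumed concurrently and their full texts gathered under one name each. The sketch below uses `asyncio.gather` for this. `aggregate`, `collect`, and `demo_stream` are hypothetical names, and the streams stand in for separate Shadai calls.

```python
import asyncio
from typing import AsyncIterator, Dict, List


async def demo_stream(chunks: List[str]) -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in chunks:
        await asyncio.sleep(0)
        yield chunk


async def collect(stream: AsyncIterator[str]) -> str:
    return "".join([chunk async for chunk in stream])


async def aggregate(streams: Dict[str, AsyncIterator[str]]) -> Dict[str, str]:
    names = list(streams)
    # gather runs the collectors concurrently and returns results in input order
    results = await asyncio.gather(*(collect(streams[name]) for name in names))
    return dict(zip(names, results))


if __name__ == "__main__":
    combined = asyncio.run(
        aggregate({
            "first": demo_stream(["alpha ", "beta"]),
            "second": demo_stream(["one ", "two"]),
        })
    )
    print(combined)
```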

Performance Optimization

Fast Display

Memory Efficiency
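
For long responses, memory stays flat if each chunk is processed and then discarded rather than stored. As one illustration, the sketch below computes a running length and SHA-256 digest without keeping the text. The choice of digest is an assumption made for the example, and `demo_stream` is a hypothetical stand-in.

```python
import asyncio
import hashlib
from typing import AsyncIterator, Tuple


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a Shadai streaming call."""
    for chunk in ["abc", "def"]:
        await asyncio.sleep(0)
        yield chunk


async def fingerprint_stream(stream: AsyncIterator[str]) -> Tuple[int, str]:
    total_chars = 0
    digest = hashlib.sha256()
    async for chunk in stream:
        total_chars += len(chunk)
        digest.update(chunk.encode("utf-8"))  # fold the chunk in, then let it go
    return total_chars, digest.hexdigest()


if __name__ == "__main__":
    print(asyncio.run(fingerprint_stream(demo_stream())))
```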

UI Integration

React (Frontend)

Streamlit

Best Practices

✅ Do This

❌ Don't Do This

Troubleshooting

No Output Appearing

Delayed Output

Next Steps

Error Handling - Robust error management

Web Server Integration - Production APIs

Performance Optimization - Scale efficiently

Pro Tip: Always use flush=True for real-time display in terminals!
