WriterzRoom does not use a single AI model for all tasks. Each agent in the pipeline is assigned a model based on the generation tier selected at request time. This tier-aware routing ensures that cost-sensitive agents (planner, researcher, formatter) use fast, affordable models while quality-critical agents (writer, editor) use more capable models.

Tier-to-Model Mapping

| Agent       | Quick            | Standard         | Premium          |
|-------------|------------------|------------------|------------------|
| Planner     | Claude Haiku 4.5 | Claude Haiku 4.5 | Claude Haiku 4.5 |
| Researcher  | —                | Claude Haiku 4.5 | Claude Haiku 4.5 |
| Call Writer | —                | Claude Haiku 4.5 | Claude Haiku 4.5 |
| Writer      | Claude Haiku 4.5 | Claude Sonnet 4  | Claude Opus 4    |
| Editor      | —                | Claude Sonnet 4  | Claude Sonnet 4  |
| Formatter   | Claude Haiku 4.5 | Claude Haiku 4.5 | Claude Haiku 4.5 |
| SEO         | —                | Claude Haiku 4.5 | Claude Haiku 4.5 |
| Publisher   | —                | Claude Haiku 4.5 | Claude Haiku 4.5 |
The writer is the only agent whose model changes across all three tiers. The editor uses Sonnet 4 on both Standard and Premium because its job is quality enforcement, not creative generation; Sonnet is sufficient for editing regardless of tier.
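The mapping above can be sketched as a lookup table. This is an illustrative reconstruction, not the actual WriterzRoom source: the model ID strings and the error behavior for agents disabled in a tier are assumptions.

```python
# Hypothetical sketch of the tier-to-model mapping table above.
# Model ID strings are placeholders, not confirmed API identifiers.

HAIKU, SONNET, OPUS = "claude-haiku-4.5", "claude-sonnet-4", "claude-opus-4"

TIER_MODELS = {
    # Quick runs only the planner, writer, and formatter.
    "quick": {"planner": HAIKU, "writer": HAIKU, "formatter": HAIKU},
    "standard": {
        "planner": HAIKU, "researcher": HAIKU, "call_writer": HAIKU,
        "writer": SONNET, "editor": SONNET,
        "formatter": HAIKU, "seo": HAIKU, "publisher": HAIKU,
    },
    # Premium is identical to Standard except the writer upgrades to Opus.
    "premium": {
        "planner": HAIKU, "researcher": HAIKU, "call_writer": HAIKU,
        "writer": OPUS, "editor": SONNET,
        "formatter": HAIKU, "seo": HAIKU, "publisher": HAIKU,
    },
}

def get_tier_model(tier: str, agent_role: str) -> str:
    """Resolve the model for an agent, or fail if the agent is absent in this tier."""
    try:
        return TIER_MODELS[tier][agent_role]
    except KeyError:
        raise ValueError(f"agent {agent_role!r} is not enabled in tier {tier!r}")
```

Note how the structure makes the two facts above visible at a glance: only the `writer` entry differs between `standard` and `premium`, and agents marked "—" in the table simply have no entry in the `quick` tier.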

How Routing Works

When a generation request arrives, the API resolves the tier from the generation_mode field:
  1. The generate endpoint reads req.generation_mode (values: quick, standard, premium)
  2. It calls get_tier_model(tier, agent_role) to resolve the model string for the writer
  3. The resolved model name is stored in state.writing_context["_tier_model"]
  4. Each graph node reads _tier_model from the writing context and passes it to the agent
This means model selection happens once at request time and is deterministic for the entire pipeline. There is no runtime negotiation or model switching mid-generation.
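The four steps above can be sketched end to end. The field and key names (`generation_mode`, `writing_context`, `"_tier_model"`) come from this page; the request dict, the `PipelineState` class, and the inline model lookup are illustrative stand-ins for the real implementation.

```python
# Minimal, hypothetical sketch of the request-time routing steps above.

from dataclasses import dataclass, field

# Stand-in for the real get_tier_model: writer models per tier,
# taken from the mapping table on this page.
WRITER_MODELS = {
    "quick": "claude-haiku-4.5",
    "standard": "claude-sonnet-4",
    "premium": "claude-opus-4",
}

@dataclass
class PipelineState:
    writing_context: dict = field(default_factory=dict)

def handle_generate(req: dict) -> PipelineState:
    tier = req["generation_mode"]                   # step 1: read the tier
    model = WRITER_MODELS[tier]                     # step 2: resolve once
    state = PipelineState()
    state.writing_context["_tier_model"] = model    # step 3: store in state
    return state

def writer_node(state: PipelineState) -> str:
    # step 4: each graph node reads the pre-resolved model from the context
    return state.writing_context["_tier_model"]
```

Because resolution happens once in `handle_generate`, every node downstream sees the same model string; nothing re-negotiates mid-run.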

Circuit Breaker Failover

WriterzRoom maintains circuit breakers for three providers: Anthropic, OpenAI, and Tavily. Each circuit breaker tracks success and failure rates independently and can be in one of three states: closed (healthy), half-open (testing), or open (blocking requests). When a provider’s circuit breaker opens, agents that depend on it attempt failover to the alternate provider:
  • Writer (primary: Anthropic Claude) β†’ failover to OpenAI GPT-4o
  • Planner (primary: Anthropic Claude Haiku) β†’ failover to OpenAI GPT-4o
  • Researcher (primary: Tavily) β†’ no failover (fails fast)
Failover is automatic and transparent to the user. The circuit breaker records success/failure for each attempt and recovers automatically after a cooldown period. If both Anthropic and OpenAI circuit breakers are open simultaneously, the generation fails immediately with a clear error message rather than queuing or retrying indefinitely.
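The failover rules above can be sketched as follows. This is a hedged illustration: the `CircuitBreaker` class, its thresholds, and the cooldown value are assumptions, not the production settings, and only the writer's Anthropic-to-OpenAI path is shown.

```python
# Illustrative circuit breaker with the three states described above:
# closed (healthy), open (blocking), half-open (testing after cooldown).
import time

class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, cooldown: float = 30.0):
        self.failures = 0
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.opened_at = None  # None means the breaker is closed

    def allow(self) -> bool:
        if self.opened_at is None:
            return True                                       # closed
        if time.monotonic() - self.opened_at >= self.cooldown:
            return True                                       # half-open: test request
        return False                                          # open: block

    def record_success(self):
        self.failures = 0
        self.opened_at = None          # recover: close the breaker

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()  # trip open

breakers = {"anthropic": CircuitBreaker(), "openai": CircuitBreaker()}

def pick_writer_provider() -> str:
    """Primary Anthropic, failover to OpenAI, fail fast if both are open."""
    if breakers["anthropic"].allow():
        return "anthropic"
    if breakers["openai"].allow():
        return "openai"
    raise RuntimeError("All LLM providers unavailable (circuit breakers open)")
```

The fail-fast branch matches the documented behavior: with both breakers open, the request errors immediately instead of queuing or retrying.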

Token Budget Management

Each agent operates within token budget constraints determined by the tier:
  • Quick: 8,000 max completion tokens
  • Standard: 16,000 max completion tokens
  • Premium: 24,000 max completion tokens
The writer enforces a minimum token allocation based on the template’s word count requirements. If the template requires 4,000 words minimum (e.g., research papers), the writer calculates min_tokens = min_words Γ— 1.3 and overrides the user’s max_tokens setting if it falls below this threshold.
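A worked example of the override above: the tier budgets and the 1.3 multiplier come from this page, while the function name and the exact clamping order are assumptions about the implementation.

```python
# Hypothetical sketch of the writer's token budget logic described above.

TIER_MAX_TOKENS = {"quick": 8_000, "standard": 16_000, "premium": 24_000}

def effective_max_tokens(tier: str, user_max_tokens: int, min_words: int = 0) -> int:
    """Clamp to the tier budget, then raise to the template minimum if needed.

    Whether the minimum may exceed the tier cap is an assumption here;
    this sketch lets it, since the override exists to satisfy the template.
    """
    budget = min(user_max_tokens, TIER_MAX_TOKENS[tier])
    min_tokens = int(min_words * 1.3)   # min_tokens = min_words x 1.3
    return max(budget, min_tokens)

# A 4,000-word research paper needs at least 4,000 x 1.3 = 5,200 tokens,
# so a user setting of 3,000 is overridden upward:
effective_max_tokens("standard", 3_000, min_words=4_000)   # -> 5200
```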

API Model Override

For API users on Professional and Enterprise plans, the generation_settings object in the request body accepts a model field, but it is informational only: the actual model is always determined by the tier. The API does not permit arbitrary model selection, both to prevent cost abuse and to keep quality guarantees consistent.
Last modified on March 27, 2026