Token Usage FAQ

Common questions about tokens, pricing, and usage tracking in Fusion AI.

Token Basics

What are tokens?

Tokens are the units that AI models use to process text. Generally, 1 token ≈ 4 characters in English, or about 3/4 of a word. For example, "Hello world!" is approximately 3 tokens.

Examples:
• "Hello" = 1 token
• "Hello world!" = 3 tokens
• "The quick brown fox" = 4 tokens
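
The 4-characters-per-token heuristic can be sketched in a few lines of Python. Real tokenizers are model-specific, so treat this only as a rough estimate, not an exact count:

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate using the ~4 characters per token heuristic."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("Hello"))         # → 1
print(estimate_tokens("Hello world!"))  # → 3
```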

How are tokens counted?

Token usage includes both input (prompt) and output (completion) tokens:

  • Input tokens: your prompt/message, billed at a cheaper rate
  • Output tokens: the AI's response, billed at a higher rate

Do different models have different token costs?

Yes! Token costs vary significantly between models:

Model            Input Cost        Output Cost
Claude 3 Haiku   $0.25/1M tokens   $1.25/1M tokens
GPT-4 Turbo      $10.00/1M tokens  $30.00/1M tokens
Claude 3 Sonnet  $3.00/1M tokens   $15.00/1M tokens
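
Using the prices in the table above, the cost of a single request can be computed from its input and output token counts. The dictionary keys below are illustrative labels, not official model IDs:

```python
# Per-million-token prices (USD) from the table above.
PRICING = {
    "claude-3-haiku":  {"input": 0.25,  "output": 1.25},
    "gpt-4-turbo":     {"input": 10.00, "output": 30.00},
    "claude-3-sonnet": {"input": 3.00,  "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single request."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# 1,000 input tokens + 500 output tokens on Claude 3 Haiku:
print(request_cost("claude-3-haiku", 1000, 500))  # → 0.000875
```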

NeuroSwitch™ & Token Optimization

How does NeuroSwitch™ help with token costs?

NeuroSwitch™ automatically routes requests to the most cost-effective model for each task:

  • Simple questions → cheaper models (e.g., Claude 3 Haiku)
  • Complex reasoning → premium models only when needed
  • Automatic fallback if the preferred model is unavailable
  • Real-time cost optimization based on task complexity
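
The routing idea can be illustrated with a toy sketch. NeuroSwitch™'s actual logic runs server-side and is far more sophisticated; the keyword list and model labels below are purely hypothetical:

```python
def route_request(prompt: str) -> str:
    """Toy illustration of complexity-based routing (not NeuroSwitch's real logic)."""
    complex_markers = ("prove", "derive", "analyze", "step by step")
    if any(marker in prompt.lower() for marker in complex_markers):
        return "claude-3-sonnet"  # premium model for complex reasoning
    return "claude-3-haiku"       # cheaper model for simple questions

print(route_request("What is the capital of France?"))    # → claude-3-haiku
print(route_request("Prove this theorem step by step."))  # → claude-3-sonnet
```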

Example: Cost savings with NeuroSwitch™

Scenario: 1000 mixed requests per day

Without NeuroSwitch™:

All requests to GPT-4: ~$45/day

With NeuroSwitch™:

Mixed routing: ~$18/day (60% savings)
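
The savings figure follows directly from the two daily costs in the scenario:

```python
# Figures from the scenario above: 1,000 mixed requests per day.
cost_without = 45.0  # all requests to GPT-4, ~$/day
cost_with = 18.0     # NeuroSwitch mixed routing, ~$/day

savings = (cost_without - cost_with) / cost_without
print(f"{savings:.0%}")  # → 60%
```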

Usage Tracking & Monitoring

How can I track my token usage?

Fusion AI provides detailed usage analytics:

  • Real-time dashboard with current usage
  • Daily, weekly, and monthly usage reports
  • Cost breakdown by model and provider
  • Usage alerts and budget limits
  • Export usage data as CSV
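
Once exported, usage data can be aggregated locally, for example to break down cost by model. The record shape below is hypothetical; the actual CSV columns Fusion AI exports may differ:

```python
from collections import defaultdict

# Hypothetical usage records; the real export's column names may differ.
records = [
    {"model": "claude-3-haiku", "tokens": 1200, "cost_usd": 0.0015},
    {"model": "gpt-4-turbo",    "tokens": 800,  "cost_usd": 0.0320},
    {"model": "claude-3-haiku", "tokens": 900,  "cost_usd": 0.0011},
]

by_model = defaultdict(float)
for r in records:
    by_model[r["model"]] += r["cost_usd"]

for model, cost in sorted(by_model.items()):
    print(f"{model}: ${cost:.4f}")
```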

Can I set usage limits?

Yes! You can set various limits:

Soft Limits (Alerts)

  • Daily/monthly spend alerts
  • Token usage warnings
  • Email notifications

Hard Limits (Stop)

  • Maximum daily spend
  • Request rate limits
  • Auto-pause when the budget is reached
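
The soft/hard limit behavior can be sketched as a small client-side guard. Fusion AI enforces its own limits server-side, so this is illustrative only:

```python
class BudgetGuard:
    """Client-side sketch of soft (alert) and hard (stop) spend limits."""

    def __init__(self, soft_limit: float, hard_limit: float):
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.spent = 0.0

    def record(self, cost: float) -> str:
        """Record a request's cost and report the limit status."""
        self.spent += cost
        if self.spent >= self.hard_limit:
            return "stop"   # hard limit: block further requests
        if self.spent >= self.soft_limit:
            return "alert"  # soft limit: send a warning
        return "ok"

guard = BudgetGuard(soft_limit=5.0, hard_limit=10.0)
print(guard.record(3.0))  # → ok
print(guard.record(3.0))  # → alert (6.0 spent >= 5.0 soft limit)
print(guard.record(5.0))  # → stop  (11.0 spent >= 10.0 hard limit)
```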

Common Issues

Why is my token count higher than expected?

Several factors can increase token usage:

  • System messages and conversation history count as tokens
  • Special characters and formatting may increase token count
  • Different languages have different token densities
  • Model-specific tokenization can vary slightly

How can I reduce token usage?

  • Use NeuroSwitch™ for automatic cost optimization
  • Keep prompts concise and clear
  • Use prompt caching for repeated requests
  • Choose appropriate models for your use case
  • Limit conversation history in chat applications

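
The last tip — limiting conversation history — can be sketched as a simple trimming helper that keeps the system message plus only the most recent turns (the message format here is a generic role/content dict, which your client library may shape differently):

```python
def trim_history(messages: list[dict], max_turns: int = 6) -> list[dict]:
    """Keep the system message (if any) plus only the most recent turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns:]

history = [{"role": "system", "content": "You are helpful."}]
history += [{"role": "user", "content": f"question {i}"} for i in range(20)]
trimmed = trim_history(history, max_turns=6)
print(len(trimmed))  # → 7 (system message + 6 most recent turns)
```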
Do failed requests still consume tokens?

Generally no, but it depends on when the failure occurs:

  • Authentication errors: no tokens consumed
  • Rate limit errors: no tokens consumed
  • Model errors after processing: tokens may be consumed
  • Partial responses: only the tokens actually consumed are charged

Learn More