API Rate Limits
Understand usage limits, pricing tiers, and optimization strategies. Scale your AI applications with transparent and flexible rate limiting.
Types of Limits
Request Rate Limits
Maximum number of API requests per time period
Token Usage Limits
Maximum tokens consumed per time period
Concurrent Request Limits
Maximum simultaneous API connections
Pricing Tiers & Limits
Free Tier
$0/month
Rate Limits
Features
Pro Tier
$29/month
Rate Limits
Features
Enterprise
Custom pricing
Rate Limits
Features
Rate Limit Headers
Header | Description | Example |
---|---|---|
X-RateLimit-Limit | Maximum requests allowed in the current time window | 1000 |
X-RateLimit-Remaining | Number of requests remaining in the current time window | 847 |
X-RateLimit-Reset | Unix timestamp when the rate limit resets | 1704063600 |
X-RateLimit-Retry-After | Seconds to wait before making another request | 60 |
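As a rough illustration of how a client might act on these headers, the Python sketch below computes how long to wait before sending the next request. The function name `seconds_until_retry` is hypothetical, and the order of precedence (use X-RateLimit-Retry-After when present, otherwise fall back to X-RateLimit-Reset once X-RateLimit-Remaining reaches zero) is an assumption rather than documented behavior.

```python
import time
from typing import Mapping, Optional


def seconds_until_retry(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how many seconds to wait before the next request (0.0 means send now)."""
    now = time.time() if now is None else now

    # Assumption: an explicit retry hint, when present, takes precedence.
    retry_after = headers.get("X-RateLimit-Retry-After")
    if retry_after is not None:
        return max(0.0, float(retry_after))

    # Requests are still available in this window, so no wait is needed.
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0

    # Window exhausted: wait until the Unix timestamp at which it resets.
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

The lookup here is case-sensitive because it accepts a plain mapping. HTTP header names are case-insensitive, so pass a case-insensitive mapping (such as the one `requests` exposes on a response) or normalize the keys first.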
Optimization Strategies
Request Optimization
Batching Requests
Combine multiple operations into a single request when possible to reduce the number of API calls.
Caching Responses
Cache frequently requested data to avoid repeated API calls for the same content.
Async Processing
Use asynchronous requests with bounded concurrency to maximize throughput without exceeding your concurrent request limit.
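The caching and concurrency ideas above can be combined in a small client-side helper. The sketch below is only one possible approach: `fetch_all` and its `call_api` parameter are hypothetical names, `call_api` stands in for whatever coroutine actually sends a request, and the default of five concurrent calls is a placeholder, not a documented limit.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def fetch_all(
    prompts: List[str],
    call_api: Callable[[str], Awaitable[str]],
    max_concurrent: int = 5,
) -> List[str]:
    """Call call_api for each prompt, keeping at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    cache: Dict[str, "asyncio.Task[str]"] = {}

    async def limited(prompt: str) -> str:
        # The semaphore blocks here once max_concurrent calls are running.
        async with semaphore:
            return await call_api(prompt)

    tasks = []
    for prompt in prompts:
        # Identical prompts share one task, so each distinct prompt is sent once.
        if prompt not in cache:
            cache[prompt] = asyncio.create_task(limited(prompt))
        tasks.append(cache[prompt])

    # Results come back in the same order as the input prompts.
    return list(await asyncio.gather(*tasks))
```

The cache in this sketch lives only for the duration of one `fetch_all` call. A longer-lived cache would also need an expiry policy, which is left out here.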
Token Optimization
Prompt Engineering
Write concise, effective prompts to minimize input token usage while maintaining quality.
Response Limits
Set an appropriate max_tokens value to cap output length and cost.
Model Selection
Use NeuroSwitch, or choose a model yourself by weighing task complexity against cost.
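To make the token-saving ideas above concrete, here is a minimal sketch of a request builder. Everything except `max_tokens` is an assumption made for illustration: the payload field names `model` and `prompt`, the helper name `build_request`, and the two model identifiers are hypothetical, so check the API reference for the real request schema.

```python
from typing import Any, Dict

# Hypothetical model identifiers; substitute the names your account actually offers.
CHEAP_MODEL = "example-small"
CAPABLE_MODEL = "example-large"


def build_request(prompt: str, complex_task: bool = False, max_tokens: int = 256) -> Dict[str, Any]:
    """Build a payload that caps output length and picks a model by task complexity."""
    return {
        # Reserve the more capable (and presumably more expensive) model for complex tasks.
        "model": CAPABLE_MODEL if complex_task else CHEAP_MODEL,
        # Collapse runs of whitespace, which may otherwise add input tokens.
        "prompt": " ".join(prompt.split()),
        # Upper bound on generated tokens, which bounds output cost per request.
        "max_tokens": max_tokens,
    }
```

For example, `build_request("Summarize this text", max_tokens=100)` selects the cheaper placeholder model and limits the response to 100 tokens.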
Monitoring & Alerts
Usage Monitoring
Alert Configuration
Requesting Limit Increases
When to Request Increases
Consistent High Usage
Regularly reaching 80% or more of your current limits
Production Requirements
Deploying to production with higher expected traffic
Batch Processing
Large data processing jobs requiring burst capacity
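The 80% guideline above can be checked from the rate limit headers you already receive. The sketch below assumes you record (limit, remaining) pairs taken from X-RateLimit-Limit and X-RateLimit-Remaining near the end of each window. The function names are hypothetical, and treating "regularly" as "at least half of the sampled windows" is an arbitrary assumption.

```python
from typing import Iterable, Tuple


def utilization(limit: int, remaining: int) -> float:
    """Fraction of the window's request quota that has been used."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (limit - remaining) / limit


def should_request_increase(
    samples: Iterable[Tuple[int, int]],
    threshold: float = 0.8,
    min_share: float = 0.5,
) -> bool:
    """Return True when enough sampled windows reached the utilization threshold."""
    samples = list(samples)
    if not samples:
        return False
    # Count the windows whose usage reached the threshold (80% by default).
    high = sum(1 for limit, remaining in samples if utilization(limit, remaining) >= threshold)
    # min_share is an assumed definition of "regularly", not a documented rule.
    return high / len(samples) >= min_share
```

With the example header values from the table, a limit of 1000 and 847 remaining, 153 requests have been used. That is 15.3% utilization, well below the threshold.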
Request Process
Submit Request
Contact support with usage details and requirements
Review Process
Our team reviews your usage patterns and business needs
Approval & Implementation
Approved increases are applied within 24-48 hours