Overview
AgenticPencil enforces rate limits to ensure fair usage and optimal performance for all users. Rate limits are applied per API key and reset continuously over a sliding 60-second window.
Rate Limits by Plan
Free Plan
10 requests per minute. Perfect for development and small-scale testing.
Pro Plan
60 requests per minute. Ideal for production applications with moderate traffic.
Scale Plan
120 requests per minute. Built for high-volume applications and intensive workflows.
Enterprise Plan
300 requests per minute. Custom limits available for enterprise needs.
How Rate Limits Work
Sliding Window Algorithm
AgenticPencil uses a sliding window approach for rate limiting:
- Your rate limit counter tracks requests made in the past 60 seconds
- As time passes, older requests “fall off” the window
- This provides smoother request distribution compared to fixed windows
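As an illustration, the accounting described above can be sketched client-side like this (AgenticPencil's server-side implementation may differ in detail):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow a request only if fewer than `limit` requests
    occurred in the past `window` seconds."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.timestamps = deque()  # times of requests still in the window

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Older requests "fall off" the window as time passes
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Because the window slides, a blocked caller can retry as soon as the oldest tracked request ages past 60 seconds rather than waiting for a fixed reset boundary.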
Per API Key Enforcement
- Each API key has its own independent rate limit
- Multiple API keys on the same account each receive the same per-key limit; usage is not pooled across keys
- Team members with separate API keys don’t affect each other’s limits
Reset Behavior
- Rate limits continuously reset as the sliding window moves
- No specific “reset time” - it’s constantly updating
- If you hit your limit, you can make requests again as soon as older requests age out
Rate Limit Headers
Every API response includes rate limit information in the headers:
Response Headers
- Your current rate limit (requests per minute)
- Number of requests remaining in the current window
- Unix timestamp when the oldest request in your window will age out
- Number of requests used in the current window
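For example, if the headers follow the common X-RateLimit-* naming convention (the exact header names here are assumptions; confirm them against a live AgenticPencil response), they can be parsed like this:

```python
# NOTE: the X-RateLimit-* names below are assumed, not confirmed.
def parse_rate_limit_headers(headers):
    """Extract rate limit info from a response-headers mapping."""
    return {
        "limit": int(headers.get("X-RateLimit-Limit", 0)),      # requests per minute
        "remaining": int(headers.get("X-RateLimit-Remaining", 0)),
        "reset": int(headers.get("X-RateLimit-Reset", 0)),      # Unix timestamp
        "used": int(headers.get("X-RateLimit-Used", 0)),
    }

info = parse_rate_limit_headers({
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "12",
    "X-RateLimit-Reset": "1735689600",
    "X-RateLimit-Used": "48",
})
```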
Rate Limit Exceeded Response
When you exceed your rate limit, you’ll receive a 429 Too Many Requests response. The error body tells you how many seconds to wait before making another request.
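One way to handle this response is sketched below; the RateLimitError wrapper and its retry_after field are hypothetical stand-ins for however your HTTP client surfaces the 429 and its wait hint:

```python
import time

class RateLimitError(Exception):
    """Hypothetical wrapper for a 429 response; carries the wait hint."""
    def __init__(self, retry_after=None):
        self.retry_after = retry_after

def call_with_retry(func, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry `func` on 429s, honoring the server's wait hint when
    present and falling back to exponential backoff otherwise."""
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError as err:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            delay = err.retry_after if err.retry_after else base_delay * (2 ** attempt)
            sleep(delay)
```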
Best Practices
1. Monitor Rate Limit Headers
Always check the rate limit headers in your responses.
2. Implement Exponential Backoff
When you hit rate limits, use exponential backoff to retry requests.
3. Batch and Queue Requests
For high-volume applications, implement request queuing.
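A minimal sketch of such a queue, pacing calls through a worker thread to stay under a requests-per-minute budget (the class and pacing strategy are illustrative, not an official client):

```python
import threading
import time
from queue import Queue

class RequestQueue:
    """Serialize API calls through a worker thread, spacing them
    evenly to stay under a requests-per-minute budget."""

    def __init__(self, requests_per_minute=60):
        self.interval = 60.0 / requests_per_minute
        self.tasks = Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, func, callback=None):
        """Enqueue a zero-argument callable; callback receives its result."""
        self.tasks.put((func, callback))

    def _run(self):
        while True:
            func, callback = self.tasks.get()
            result = func()
            if callback is not None:
                callback(result)
            self.tasks.task_done()
            time.sleep(self.interval)  # pace calls to stay under the budget
```

Callers submit work and move on; the worker drains the queue at a rate the plan's limit can absorb.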
4. Use Multiple API Keys
For maximum throughput, create multiple API keys:
- Load Distribution: Distribute requests across multiple API keys to multiply your effective rate limit
- Fault Tolerance: If one key gets rate limited, others can continue processing
- Team Separation: Give different team members or services their own keys
- Environment Isolation: Use separate keys for development, staging, and production
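Load distribution can be as simple as round-robin rotation over a key pool (key values below are placeholders):

```python
from itertools import cycle

class KeyRotator:
    """Round-robin across several API keys to spread request load."""

    def __init__(self, api_keys):
        self._keys = cycle(api_keys)  # endless round-robin iterator

    def next_key(self):
        """Return the key to use for the next request."""
        return next(self._keys)

rotator = KeyRotator(["key-aaa", "key-bbb", "key-ccc"])  # placeholder keys
```

With three keys on the same plan, sustained throughput roughly triples, since each key carries its own independent limit.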
Rate Limit Optimization Strategies
Batch Similar Requests
Instead of making multiple keyword research requests with low limits, make fewer requests with higher limits:
- Less efficient: 10 requests with limit=10 each = 10 API calls
- More efficient: 1 request with limit=100 = 1 API call
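The arithmetic behind this comparison, as a small helper:

```python
def api_calls_needed(total_results, per_request_limit):
    """How many API calls it takes to fetch total_results items
    when each call returns at most per_request_limit of them."""
    return -(-total_results // per_request_limit)  # ceiling division

# Fetching 100 keywords at limit=10 costs 10 calls; at limit=100 it costs 1.
```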
Cache Results
Store API responses locally to avoid repeated requests for the same data:
- Cache keyword research results for 24-48 hours
- Cache content audits for 7-14 days
- Cache usage data for 1 hour
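A simple per-entry TTL cache is enough for the lifetimes above (a sketch; production code might use a persistent store instead of an in-memory dict):

```python
import time

class TTLCache:
    """In-memory cache with a per-entry time-to-live, in seconds."""

    def __init__(self):
        self._store = {}

    def set(self, key, value, ttl):
        # Record the value together with its expiry time
        self._store[key] = (value, time.monotonic() + ttl)

    def get(self, key):
        """Return the cached value, or None if missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.monotonic() >= expires:
            del self._store[key]  # evict stale entry
            return None
        return value
```

For example, a keyword research result would be stored with ttl=24 * 3600 and served from the cache on repeat lookups, spending no rate limit at all.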
Request Prioritization
Prioritize critical requests during high-traffic periods:
- Real-time user requests get priority
- Background analytics can be delayed
- Batch processing during off-peak hours
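This priority ordering can be sketched with a heap-backed queue (the priority tiers are illustrative labels, not API values):

```python
import heapq
import itertools

# Lower number = higher priority; illustrative tiers
REALTIME, BACKGROUND, BATCH = 0, 1, 2

class PriorityRequestQueue:
    """Dispatch queued requests in priority order; ties stay FIFO."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # breaks ties by insertion order

    def submit(self, priority, request):
        heapq.heappush(self._heap, (priority, next(self._counter), request))

    def next_request(self):
        """Pop the highest-priority request, or None if empty."""
        return heapq.heappop(self._heap)[2] if self._heap else None
```

During a traffic spike, real-time user requests drain first while background analytics and batch jobs wait their turn.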
Precompute Data
For predictable use cases, precompute and store results:
- Daily content audits during low-traffic hours
- Weekly competitive analysis batches
- Monthly comprehensive keyword research