Pricing & Credits
Understand Halfred's transparent prepaid credit system, profile pricing, cost calculation, and optimization strategies.
How It Works
1. Add Credits
Purchase credits in advance through your dashboard:
$10 = 10 credits
$50 = 50 credits
$100 = 100 credits
Custom amounts available
2. Credits Are Consumed
Each API request consumes credits based on:
Model profile used (lite, standard, or deepthink)
Number of tokens processed (input + output)
Current pricing for that profile
3. Monitor Usage
Track your credit balance and usage in real time:
Dashboard shows current balance
Each API response includes token usage
Historical usage reports available
Set up low-balance alerts
Profile Pricing
Credits are consumed based on token usage and the profile you choose:
Profile | Output (per 1M tokens) | Input (per 1M tokens)
LITE | $0.50 | $0.05
STANDARD | $2.50 | $0.25
DEEPTHINK | $12.50 | $1.25
DEV | $0.00 (Free) | $0.00 (Free)
For the most up-to-date pricing, please refer to the Profile Dashboard (requires login).
Input token pricing is 10% of output token pricing
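As a rough illustration, the rates listed above could be kept in a small lookup table and combined with token counts to estimate what a request will cost. The names `PROFILE_PRICES` and `estimateCost` are hypothetical and not part of any Halfred SDK; the prices are the ones shown on this page and may change.

```javascript
// Prices in USD per 1M tokens, copied from the pricing listed on this page.
// The object layout and function name are illustrative assumptions.
const PROFILE_PRICES = {
  lite: { input: 0.05, output: 0.5 },
  standard: { input: 0.25, output: 2.5 },
  deepthink: { input: 1.25, output: 12.5 },
  dev: { input: 0, output: 0 },
};

// Returns the estimated cost of one request in USD.
function estimateCost(profile, inputTokens, outputTokens) {
  const price = PROFILE_PRICES[profile];
  if (!price) {
    throw new Error(`Unknown profile: ${profile}`);
  }
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// 20 input tokens and 750 output tokens on the STANDARD profile
console.log(estimateCost("standard", 20, 750).toFixed(6)); // prints "0.001880"
```

Keeping the rates in one place means a price change only has to be updated once.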
Calculating Costs
Formula
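Cost = (Input Tokens × Input Price) + (Output Tokens × Output Price)
Prices are quoted per 1 million tokens, so divide each listed price by 1,000,000 to get the per-token rate.
Input Tokens & Thinking Models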
Some advanced models use reasoning tokens (also called "thinking tokens") during their internal processing. These tokens are counted as input tokens and billed accordingly.
What are reasoning tokens?
Internal "thinking" the model does before generating the final response
Used by models like OpenAI's gpt-5 series for complex reasoning tasks
Not visible in the final output but improve response quality
Cost impact:
Reasoning tokens are added to your input token count
More complex questions may generate more reasoning tokens
Primarily affects the DEEPTHINK profile, which uses advanced reasoning models
Example:
User prompt: "Solve this complex math problem..."
- Visible input tokens: 10
- Reasoning tokens (internal): 500
- Total input tokens billed: 510
- Output tokens: 100
Cost with DEEPTHINK profile:
- Input: 510 × $1.25/M = $0.0006375
- Output: 100 × $12.50/M = $0.00125
- Total: ~$0.00189 (0.00189 credits)
💡 Tip: Check the usage field in API responses to see the exact breakdown of prompt tokens (input + reasoning) and completion tokens (output).
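To see how much of a DEEPTHINK request's cost comes from reasoning tokens, the example above can be reproduced with a small helper. This is a sketch only: `deepthinkBreakdown` is a hypothetical name, and it assumes you know the visible and reasoning token counts separately, which the usage field may not report individually.

```javascript
// DEEPTHINK prices in USD per 1M tokens, as listed on this page.
const DEEPTHINK = { input: 1.25, output: 12.5 };

// Hypothetical helper: splits the cost of one DEEPTHINK request into parts.
// Reasoning tokens are billed at the input rate, as described above.
function deepthinkBreakdown(visibleInput, reasoning, output) {
  const perToken = (pricePerMillion) => pricePerMillion / 1_000_000;
  const visibleCost = visibleInput * perToken(DEEPTHINK.input);
  const reasoningCost = reasoning * perToken(DEEPTHINK.input);
  const outputCost = output * perToken(DEEPTHINK.output);
  const total = visibleCost + reasoningCost + outputCost;
  const reasoningShare = total === 0 ? 0 : reasoningCost / total;
  return { visibleCost, reasoningCost, outputCost, total, reasoningShare };
}

// Same numbers as the example: 10 visible, 500 reasoning, 100 output tokens
const breakdown = deepthinkBreakdown(10, 500, 100);
console.log(breakdown.total.toFixed(7)); // prints "0.0018875"
console.log((breakdown.reasoningShare * 100).toFixed(1) + "%"); // prints "33.1%"
```

In this example, roughly a third of the cost comes from tokens that never appear in the prompt or the response.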
Examples
Example 1: Simple Question (Lite Profile)
Request: "What is 2+2?"
- Input: ~10 tokens
- Output: ~5 tokens
Cost calculation:
- Input: 10 tokens × $0.05 / 1M = $0.0000005
- Output: 5 tokens × $0.50 / 1M = $0.0000025
Total: ~$0.000003 (0.000003 credits)
Example 2: Content Generation (Standard Profile)
Request: "Write a 500-word blog post about coffee"
- Input: ~20 tokens
- Output: ~750 tokens
Cost calculation:
- Input: 20 tokens × $0.25 / 1M = $0.000005
- Output: 750 tokens × $2.50 / 1M = $0.001875
Total: ~$0.00188 (0.00188 credits)
Example 3: Document Analysis (DeepThink Profile)
Request: Analyze a 10-page legal document
- Input: ~5,000 tokens
- Output: ~1,000 tokens
Cost calculation:
- Input: 5,000 tokens × $1.25 / 1M = $0.00625
- Output: 1,000 tokens × $12.50 / 1M = $0.0125
Total: ~$0.01875 (0.01875 credits)
Token-to-Credit Conversion
1 credit = $1 USD
This makes cost calculation straightforward:
$10 = 10 credits
$100 = 100 credits
$1,000 = 1,000 credits
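Because one credit equals one dollar, a request's cost in USD is also the number of credits it consumes. A minimal sketch of how a balance translates into request volume, using a hypothetical helper name and the approximate per-request costs from the examples on this page:

```javascript
// Hypothetical helper: how many requests of a given average cost a balance covers.
// With 1 credit = $1, credits and USD can be used interchangeably here.
function requestsCovered(balanceCredits, costPerRequestUsd) {
  if (costPerRequestUsd <= 0) {
    return Infinity; // e.g. the free DEV profile
  }
  return Math.floor(balanceCredits / costPerRequestUsd);
}

// 10 credits at ~$0.00188 per request (the STANDARD blog-post example)
console.log(requestsCovered(10, 0.00188)); // prints 5319
```

Actual volume will vary with prompt and response length, so treat the result as an order-of-magnitude estimate.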
Cost Optimization Strategies
1. Choose the Right Profile
Don't pay for more than you need:
// ❌ Expensive for simple tasks
await client.chat.completions.create({
  model: "deepthink",
  messages: [{ role: "user", content: "Hello!" }],
});
// ✅ Cost-effective
await client.chat.completions.create({
  model: "lite",
  messages: [{ role: "user", content: "Hello!" }],
});
2. Reduce Input Tokens
Shorter prompts = lower costs:
// ❌ Verbose
const verbosePrompt = "I would like you to please help me by generating a summary of the following text that I am about to provide to you...";
// ✅ Concise
const concisePrompt = "Summarize this text:";
3. Limit Output Tokens
Control response length:
await client.chat.completions.create({
  model: "standard",
  messages: [...],
  max_tokens: 100, // Limit response to 100 tokens
});
4. Cache Common Responses
Store frequently requested responses:
const cache = new Map();
async function getCachedCompletion(prompt) {
  if (cache.has(prompt)) {
    return cache.get(prompt); // No API call = no cost
  }
  const response = await client.chat.completions.create({
    model: "standard",
    messages: [{ role: "user", content: prompt }],
  });
  cache.set(prompt, response);
  return response;
}
5. Use DEV Profile for Testing
Don't consume credits during development:
// Use the free DEV profile everywhere except production
const model = process.env.NODE_ENV === 'production' ? 'standard' : 'dev';
await client.chat.completions.create({
  model: model,
  messages: [...],
});
6. Batch Similar Requests
Process multiple items in one request:
// ❌ Multiple expensive requests
for (const item of items) {
  await client.chat.completions.create({
    model: "standard",
    messages: [{ role: "user", content: `Process: ${item}` }],
  });
}
// ✅ Single batched request
await client.chat.completions.create({
  model: "standard",
  messages: [
    {
      role: "user",
      content: `Process these items: ${items.join(", ")}`,
    },
  ],
});
Monitoring Usage
In Your Dashboard
Current Balance: See credits remaining
Usage History: Track daily/weekly/monthly consumption
Alerts: Get notified when balance is low
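Dashboard alerts could be complemented by a rough client-side estimate built from the usage data that each API response returns. The sketch below is hypothetical: `SpendTracker` is not part of any Halfred SDK, it does not query the real balance, and it assumes you pass in the per-1M-token prices for the profile you used.

```javascript
// Hypothetical client-side tracker. It estimates remaining credits locally
// from usage data and never contacts Halfred for the real balance.
class SpendTracker {
  constructor(startingCredits, alertThreshold) {
    this.remaining = startingCredits;
    this.alertThreshold = alertThreshold;
  }

  // usage: { prompt_tokens, completion_tokens } taken from an API response.
  // prices: { input, output } in USD per 1M tokens for the profile used.
  record(usage, prices) {
    const cost =
      (usage.prompt_tokens * prices.input +
        usage.completion_tokens * prices.output) /
      1_000_000;
    this.remaining -= cost;
    return {
      cost,
      remaining: this.remaining,
      low: this.remaining < this.alertThreshold,
    };
  }
}

// Start from 10 credits and flag when the estimate drops below 1 credit
const tracker = new SpendTracker(10, 1);
const result = tracker.record(
  { prompt_tokens: 50, completion_tokens: 100 },
  { input: 0.25, output: 2.5 } // STANDARD prices from this page
);
console.log(result.cost.toFixed(7), result.low);
```

The estimate drifts if other clients share the same account, so the dashboard remains the source of truth for the actual balance.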
In API Responses
Every response includes usage information:
{
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 100,
    "total_tokens": 150
  },
  "profile": "standard",
  "model": "gpt-5-mini"
}
Programmatic Monitoring
const completion = await client.chat.completions.create({
  model: "standard",
  messages: [...],
});
// STANDARD prices per 1M tokens: $0.25 input, $2.50 output
const cost =
  (completion.usage.prompt_tokens * 0.25) / 1_000_000 +
  (completion.usage.completion_tokens * 2.5) / 1_000_000;
console.log(`Request cost: $${cost.toFixed(6)}`);
console.log(`Tokens used: ${completion.usage.total_tokens}`);
Billing & Invoices
Prepaid System
No subscriptions: Pay only for what you use
No auto-renewal: Credits don't expire but must be added manually
Full control: You can't accidentally overspend
Adding Credits
Log in to your dashboard at halfred.ai
Navigate to Credits
Click "Top up Credits"
Choose an amount or enter a custom value (minimum $5)
Complete the secure payment via Stripe
Credit Balance & Limits
What Happens When Credits Run Out?
When your balance reaches zero:
API requests will return a 429 Insufficient Credits error
No requests will be processed until credits are added
No charges or overdrafts - you stay in control
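An application may want to detect this condition and react, for example by pausing a job queue or notifying an operator. The following is a minimal sketch with hypothetical helper names. It assumes the client throws an error object that carries the HTTP status in a `status` property, as OpenAI-compatible SDKs commonly do; verify this against the client you actually use.

```javascript
// Assumption: thrown errors expose the HTTP status as `err.status`.
function isInsufficientCredits(err) {
  return Boolean(err) && err.status === 429;
}

// Hypothetical wrapper: runs a request and calls onEmpty when credits run out.
// `request` is any function that returns a promise, such as a call to
// client.chat.completions.create(...).
async function withCreditCheck(request, onEmpty) {
  try {
    return await request();
  } catch (err) {
    if (isInsufficientCredits(err)) {
      onEmpty(err);
      return null;
    }
    throw err; // unrelated errors propagate unchanged
  }
}
```

Since 429 is also the conventional status code for rate limiting, it may be necessary to inspect the error message as well to tell the two cases apart; the exact error body is not specified on this page.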
Frequently Asked Questions
Do credits expire?
No, credits never expire. Use them at your own pace.
Can I get a refund?
Yes, unused credits can be refunded within 30 days of purchase.
What happens if I run out mid-request?
Requests are checked before processing. You won't be charged if you have insufficient credits.
Can I set spending limits?
Yes, you control your balance. We never charge beyond your prepaid amount.
Are there hidden fees?
No. The only cost is the credits you purchase and consume.
How does pricing compare to using providers directly?
Halfred adds a small markup for routing intelligence, unified billing, and multi-provider access. For most users, the convenience and time savings outweigh the minimal additional cost.
Can I share credits across multiple projects?
Yes. Credits are tied to your account. Currently, Halfred operates with one project per account, one API key, and one shared credit pool.
Do you offer free trials?
Yes! Use the DEV profile for free testing. No credit card required for signup.
Support
Questions about pricing or billing?
Email: [email protected]
Dashboard: halfred.ai/billing
Next Steps
Learn about Model Profiles
Understand Tokens
Optimize with Best Practices
Start building with our Quick Start Guide