Pricing & Credits

Understand Halfred's transparent prepaid credit system, profile pricing, cost calculation, and optimization strategies.

How It Works

1. Add Credits

Purchase credits in advance through your dashboard:

  • $10 = 10 credits

  • $50 = 50 credits

  • $100 = 100 credits

  • Custom amounts available

2. Credits Are Consumed

Each API request consumes credits based on:

  • Model profile used (lite, standard, or deepthink)

  • Number of tokens processed (input + output)

  • Current pricing for that profile

3. Monitor Usage

Track your credit balance and usage in real time:

  • Dashboard shows current balance

  • Each API response includes token usage

  • Historical usage reports available

  • Set up low-balance alerts

Profile Pricing

Credits are consumed based on token usage and the profile you choose:

| Profile | Price per Million Output Tokens | Price per Million Input Tokens |
| --- | --- | --- |
| LITE | $0.50 | $0.05 |
| STANDARD | $2.50 | $0.25 |
| DEEPTHINK | $12.50 | $1.25 |
| DEV | $0.00 (Free) | $0.00 (Free) |

For the most up-to-date pricing, please refer to the Profile Dashboard (requires login).

For every profile, the input token price is 10% of the output token price.

Calculating Costs

Formula

Cost = (Input Tokens × Input Price) + (Output Tokens × Output Price)
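The formula can be sketched in JavaScript as follows. This is a minimal illustration that hard-codes the per-million-token prices from the pricing table; `PRICES` and `estimateCost` are hypothetical names, not part of any Halfred SDK, and the prices should be checked against the Profile Dashboard.

```javascript
// Prices in USD per million tokens, copied from the Profile Pricing table.
// Illustrative only: PRICES and estimateCost are not part of a Halfred SDK.
const PRICES = {
  lite: { input: 0.05, output: 0.5 },
  standard: { input: 0.25, output: 2.5 },
  deepthink: { input: 1.25, output: 12.5 },
  dev: { input: 0, output: 0 },
};

// Cost = (Input Tokens × Input Price) + (Output Tokens × Output Price)
function estimateCost(profile, inputTokens, outputTokens) {
  const price = PRICES[profile];
  if (!price) throw new Error(`Unknown profile: ${profile}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

console.log(estimateCost("standard", 20, 750).toFixed(6)); // 0.001880
```

Because 1 credit = $1 USD, the returned dollar amount is also the number of credits consumed.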

Input Tokens & Thinking Models

Some advanced models use reasoning tokens (also called "thinking tokens") during their internal processing. These tokens are counted as input tokens and billed accordingly.

What are reasoning tokens?

  • Internal "thinking" the model does before generating the final response

  • Used by models like OpenAI's gpt-5 series for complex reasoning tasks

  • Not visible in the final output but improve response quality

Cost impact:

  • Reasoning tokens are added to your input token count

  • More complex questions may generate more reasoning tokens

  • Primarily affects the DEEPTHINK profile, which uses advanced reasoning models

Example:

User prompt: "Solve this complex math problem..."
- Visible input tokens: 10
- Reasoning tokens (internal): 500
- Total input tokens billed: 510
- Output tokens: 100

Cost with DEEPTHINK profile:
- Input: 510 × $1.25/M = $0.0006375
- Output: 100 × $12.50/M = $0.00125
- Total: ~$0.00189 (about 0.0019 credits)

💡 Tip: Check the usage field in API responses to see the exact breakdown of prompt tokens (input + reasoning) and completion tokens (output).
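The DEEPTHINK example above can be reproduced with a short sketch. It assumes, as this page states, that reasoning tokens are added to the billed input count; `deepthinkCost` is a hypothetical helper name used only for illustration.

```javascript
// DEEPTHINK prices in USD per million tokens, from the pricing table.
const DEEPTHINK = { input: 1.25, output: 12.5 };

// Per this page, reasoning tokens are billed as input tokens.
// Hypothetical helper, not part of a Halfred SDK.
function deepthinkCost(visibleInputTokens, reasoningTokens, outputTokens) {
  const billedInput = visibleInputTokens + reasoningTokens;
  const cost =
    (billedInput * DEEPTHINK.input + outputTokens * DEEPTHINK.output) / 1_000_000;
  return { billedInput, cost };
}

const example = deepthinkCost(10, 500, 100);
console.log(example.billedInput);     // 510
console.log(example.cost.toFixed(7)); // 0.0018875
```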

Examples

Example 1: Simple Question (Lite Profile)

Request: "What is 2+2?"
- Input: ~10 tokens
- Output: ~5 tokens

Cost calculation:
- Input: 10 tokens × $0.05 / 1M = $0.0000005
- Output: 5 tokens × $0.50 / 1M = $0.0000025
Total: ~$0.000003 (0.000003 credits)

Example 2: Content Generation (Standard Profile)

Request: "Write a 500-word blog post about coffee"
- Input: ~20 tokens
- Output: ~750 tokens

Cost calculation:
- Input: 20 tokens × $0.25 / 1M = $0.000005
- Output: 750 tokens × $2.50 / 1M = $0.001875
Total: ~$0.00188 (0.00188 credits)

Example 3: Document Analysis (DeepThink Profile)

Request: Analyze a 10-page legal document
- Input: ~5,000 tokens
- Output: ~1,000 tokens

Cost calculation:
- Input: 5,000 tokens × $1.25 / 1M = $0.00625
- Output: 1,000 tokens × $12.50 / 1M = $0.0125
Total: ~$0.01875 (0.01875 credits)

Token-to-Credit Conversion

1 credit = $1 USD

This makes cost calculation straightforward:

  • $10 = 10 credits

  • $100 = 100 credits

  • $1,000 = 1,000 credits
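Since a credit balance is also a dollar balance, it is easy to estimate how many requests of a given size a balance covers. The sketch below is illustrative; `requestsPerBalance` is a hypothetical helper, and the per-request cost is whatever you estimate for your own workload.

```javascript
// 1 credit = $1 USD, so a balance in credits is also a balance in dollars.
// Hypothetical helper: how many requests of a given cost fit in a balance.
function requestsPerBalance(credits, costPerRequestUsd) {
  if (costPerRequestUsd <= 0) return Infinity; // e.g. the free DEV profile
  return Math.floor(credits / costPerRequestUsd);
}

// Example 2 above (STANDARD blog post) costs about $0.00188 per request.
console.log(requestsPerBalance(10, 0.00188)); // 5319
```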

Cost Optimization Strategies

1. Choose the Right Profile

Don't pay for more than you need:

// ❌ Expensive for simple tasks
await client.chat.completions.create({
  model: "deepthink",
  messages: [{ role: "user", content: "Hello!" }],
});

// ✅ Cost-effective
await client.chat.completions.create({
  model: "lite",
  messages: [{ role: "user", content: "Hello!" }],
});
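To see how much the profile choice matters, the same token counts can be priced under each paid profile. This is a rough sketch using the prices from the pricing table; `compareProfiles` is a hypothetical helper, and it ignores any reasoning tokens a DEEPTHINK model might add.

```javascript
// USD per million tokens, from the Profile Pricing table (paid profiles only).
const PRICES = {
  lite: { input: 0.05, output: 0.5 },
  standard: { input: 0.25, output: 2.5 },
  deepthink: { input: 1.25, output: 12.5 },
};

// Hypothetical helper: cost of one request shape under each profile.
function compareProfiles(inputTokens, outputTokens) {
  const result = {};
  for (const [profile, price] of Object.entries(PRICES)) {
    result[profile] =
      (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
  }
  return result;
}

const costs = compareProfiles(5000, 1000);
for (const [profile, cost] of Object.entries(costs)) {
  console.log(`${profile}: $${cost.toFixed(5)}`);
}
// lite: $0.00075
// standard: $0.00375
// deepthink: $0.01875
```

At these prices, DEEPTHINK costs 25 times as much as LITE for the same token counts.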

2. Reduce Input Tokens

Shorter prompts = lower costs:

// ❌ Verbose
const prompt = "I would like you to please help me by generating a summary of the following text that I am about to provide to you...";

// ✅ Concise
const prompt = "Summarize this text:";

3. Limit Output Tokens

Control response length:

await client.chat.completions.create({
  model: "standard",
  messages: [...],
  max_tokens: 100  // Limit response to 100 tokens
});
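A `max_tokens` limit also gives you an upper bound on the output portion of a request's cost. The sketch below assumes the output prices from the pricing table; `maxOutputCost` is a hypothetical helper, and the bound does not include input tokens.

```javascript
// Output prices in USD per million tokens, from the pricing table.
const OUTPUT_PRICE_PER_MILLION = { lite: 0.5, standard: 2.5, deepthink: 12.5 };

// Hypothetical helper: the most the output side of a request can cost
// when the response is capped at maxTokens.
function maxOutputCost(profile, maxTokens) {
  const price = OUTPUT_PRICE_PER_MILLION[profile];
  if (price === undefined) throw new Error(`Unknown profile: ${profile}`);
  return (maxTokens * price) / 1_000_000;
}

console.log(maxOutputCost("standard", 100).toFixed(6)); // 0.000250
```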

4. Cache Common Responses

Store frequently requested responses:

const cache = new Map();

async function getCachedCompletion(prompt) {
  if (cache.has(prompt)) {
    return cache.get(prompt); // No API call = no cost
  }

  const response = await client.chat.completions.create({
    model: "standard",
    messages: [{ role: "user", content: prompt }],
  });

  cache.set(prompt, response);
  return response;
}
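The `Map` above grows without bound in a long-running process. One option is a size-capped cache that evicts the least recently used entry; the following is a minimal sketch of that idea, with `createBoundedCache` as a hypothetical helper name. It relies on `Map` preserving insertion order.

```javascript
// Minimal least-recently-used cache built on Map's insertion order.
// Hypothetical helper for illustration; swap it in for the plain Map above.
function createBoundedCache(maxEntries) {
  const entries = new Map();
  return {
    has(key) {
      return entries.has(key);
    },
    get(key) {
      if (!entries.has(key)) return undefined;
      const value = entries.get(key);
      // Re-insert so this key becomes the most recently used.
      entries.delete(key);
      entries.set(key, value);
      return value;
    },
    set(key, value) {
      entries.delete(key);
      entries.set(key, value);
      if (entries.size > maxEntries) {
        // The first key in iteration order is the least recently used.
        entries.delete(entries.keys().next().value);
      }
    },
    get size() {
      return entries.size;
    },
  };
}
```

Usage mirrors the original example: call `cache.has(prompt)` and `cache.get(prompt)` before making a request, and `cache.set(prompt, response)` afterwards.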

5. Use DEV Profile for Testing

Don't consume credits during development:

// Use the free DEV profile everywhere except production
const model = process.env.NODE_ENV === 'production' ? 'standard' : 'dev';

await client.chat.completions.create({
  model: model,
  messages: [...]
});

6. Batch Similar Requests

Process multiple items in one request:

// ❌ Multiple expensive requests
for (const item of items) {
  await client.chat.completions.create({
    model: "standard",
    messages: [{ role: "user", content: `Process: ${item}` }],
  });
}

// ✅ Single batched request
await client.chat.completions.create({
  model: "standard",
  messages: [
    {
      role: "user",
      content: `Process these items: ${items.join(", ")}`,
    },
  ],
});
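Very large item lists may not fit comfortably in a single prompt, so a middle ground is to send a few fixed-size batches. The sketch below only builds the prompts; `chunk` and `buildBatchPrompts` are hypothetical helpers, and the right batch size depends on your items and model.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Build one prompt per batch, using the same wording as the example above.
function buildBatchPrompts(items, size) {
  return chunk(items, size).map(
    (batch) => `Process these items: ${batch.join(", ")}`
  );
}

console.log(buildBatchPrompts(["a", "b", "c", "d", "e"], 2));
// [
//   'Process these items: a, b',
//   'Process these items: c, d',
//   'Process these items: e'
// ]
```

Each prompt can then be sent as the `content` of a single user message, giving one request per batch instead of one per item.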

Monitoring Usage

In Your Dashboard

  • Current Balance: See credits remaining

  • Usage History: Track daily/weekly/monthly consumption

  • Alerts: Get notified when balance is low

In API Responses

Every response includes usage information:

{
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 100,
    "total_tokens": 150
  },
  "profile": "standard",
  "model": "gpt-5-mini"
}

Programmatic Monitoring

const completion = await client.chat.completions.create({
  model: "standard",
  messages: [...]
});

// STANDARD profile: $0.25 per million input tokens, $2.50 per million output tokens
const cost = (
  completion.usage.prompt_tokens * 0.25 / 1_000_000 +
  completion.usage.completion_tokens * 2.50 / 1_000_000
);

console.log(`Request cost: $${cost.toFixed(6)}`);
console.log(`Tokens used: ${completion.usage.total_tokens}`);
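To track spending across many requests rather than one, the per-request calculation can be wrapped in a small accumulator. This is a sketch under the assumption that each response carries the `usage` object shown above; `createUsageTracker` and `PRICES` are hypothetical names, and the prices are copied from the pricing table.

```javascript
// USD per million tokens, from the Profile Pricing table.
const PRICES = {
  lite: { input: 0.05, output: 0.5 },
  standard: { input: 0.25, output: 2.5 },
  deepthink: { input: 1.25, output: 12.5 },
  dev: { input: 0, output: 0 },
};

// Hypothetical helper: accumulates estimated cost and tokens across requests.
function createUsageTracker() {
  let totalCost = 0;
  let totalTokens = 0;
  return {
    // `usage` is the usage object from an API response.
    record(profile, usage) {
      const price = PRICES[profile];
      if (!price) throw new Error(`Unknown profile: ${profile}`);
      const cost =
        (usage.prompt_tokens * price.input +
          usage.completion_tokens * price.output) / 1_000_000;
      totalCost += cost;
      totalTokens += usage.total_tokens;
      return cost;
    },
    summary() {
      return { totalCost, totalTokens };
    },
  };
}

const tracker = createUsageTracker();
tracker.record("standard", { prompt_tokens: 50, completion_tokens: 100, total_tokens: 150 });
console.log(tracker.summary().totalTokens); // 150
```

The totals are local estimates; the dashboard remains the authoritative record of your balance.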

Billing & Invoices

Prepaid System

  • No subscriptions: Pay only for what you use

  • No auto-renewal: Credits never expire, and you add more manually when you need them

  • Full control: You can't accidentally overspend

Adding Credits

  1. Log in to your dashboard at halfred.ai

  2. Navigate to Credits

  3. Click "Top up Credits"

  4. Choose an amount or enter a custom value (minimum $5)

  5. Complete the secure payment via Stripe

Credit Balance & Limits

What Happens When Credits Run Out?

When your balance reaches zero:

  • API requests will return a 429 Insufficient Credits error

  • No requests will be processed until credits are added

  • No charges or overdrafts - you stay in control
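In application code it helps to detect this error explicitly, since retrying will not succeed until credits are added. The sketch below is based on assumptions: that your client throws an error object exposing the HTTP status as `status` and the error text as `message`, and that the message contains "Insufficient Credits". Both helper names are hypothetical; adjust the check to the error shape your client actually produces.

```javascript
// Assumption: the thrown error carries the HTTP status as `status` and the
// error text as `message`. A 429 status is also widely used for rate limiting,
// so this sketch checks the message as well.
function isInsufficientCredits(err) {
  return (
    Boolean(err) &&
    err.status === 429 &&
    /insufficient credits/i.test(String(err.message ?? ""))
  );
}

// Wraps any request function and surfaces a clearer error when credits run out.
async function callWithCreditCheck(makeRequest) {
  try {
    return await makeRequest();
  } catch (err) {
    if (isInsufficientCredits(err)) {
      throw new Error("Out of credits: top up in the dashboard before retrying.", {
        cause: err,
      });
    }
    throw err; // Unrelated errors pass through unchanged.
  }
}
```

A call would look like `callWithCreditCheck(() => client.chat.completions.create({ ... }))`, with the request options filled in as in the examples above.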

Frequently Asked Questions

Do credits expire?

No, credits never expire. Use them at your own pace.

Can I get a refund?

Yes, unused credits can be refunded within 30 days of purchase.

What happens if I run out mid-request?

Requests are checked before processing. You won't be charged if you have insufficient credits.

Can I set spending limits?

Yes. Your prepaid balance acts as your spending limit, and we never charge beyond it.

Are there hidden fees?

No. The only cost is the credits you purchase and consume.

How does pricing compare to using providers directly?

Halfred adds a small markup for routing intelligence, unified billing, and multi-provider access. For most users, the convenience and time savings outweigh the minimal additional cost.

Can I share credits across multiple projects?

Yes. Credits belong to your account, not to an individual project. Currently, each Halfred account has one project, one API key, and one shared credit pool.

Do you offer free trials?

Yes! Use the DEV profile for free testing. No credit card required for signup.
