# Pricing & Credits

## How It Works

### 1. Add Credits

Purchase credits in advance through your dashboard:

* **$10** = 10 credits
* **$50** = 50 credits
* **$100** = 100 credits
* **Custom** amounts available

### 2. Credits Are Consumed

Each API request consumes credits based on:

* **Model profile** used (lite, standard, or deepthink)
* **Number of tokens** processed (input + output)
* **Current pricing** for that profile

### 3. Monitor Usage

Track your credit balance and usage in real time:

* Dashboard shows current balance
* Each API response includes token usage
* Historical usage reports available
* Set up low-balance alerts

## Profile Pricing

Credits are consumed based on token usage and the profile you choose:

| Profile       | Price per Million Output Tokens | Price per Million Input Tokens |
| ------------- | ------------------------------- | ------------------------------ |
| **LITE**      | $0.50                           | $0.05                          |
| **STANDARD**  | $2.50                           | $0.25                          |
| **DEEPTHINK** | $12.50                          | $1.25                          |
| **DEV**       | **$0.00** (Free)                | **$0.00** (Free)               |

*For the most up-to-date pricing, please refer to the* [*Profile Dashboard*](https://halfred.ai/app/project-profile) *(requires login).*

*For every profile, the input token price is 10% of the output token price.*

## Calculating Costs

### Formula

```
Cost = (Input Tokens × Input Price) + (Output Tokens × Output Price)
```
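
The formula can be sketched as a small helper. The rates below are copied from the Profile Pricing table and are only a snapshot; `PRICES` and `estimateCost` are illustrative names, not part of the Halfred API.

```javascript
// Per-million-token prices in USD, copied from the Profile Pricing table.
// Illustrative snapshot only; check the dashboard for current rates.
const PRICES = {
  lite: { input: 0.05, output: 0.5 },
  standard: { input: 0.25, output: 2.5 },
  deepthink: { input: 1.25, output: 12.5 },
  dev: { input: 0, output: 0 },
};

// Hypothetical helper: returns the estimated cost in USD (1 credit = $1).
function estimateCost(profile, inputTokens, outputTokens) {
  const price = PRICES[profile];
  if (!price) {
    throw new Error(`Unknown profile: ${profile}`);
  }
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// 20 input tokens and 750 output tokens on the STANDARD profile.
console.log(estimateCost("standard", 20, 750).toFixed(6)); // 0.001880
```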

### Input Tokens & Thinking Models

Some advanced models use **reasoning tokens** (also called "thinking tokens") during their internal processing. These tokens are counted as **input tokens** and billed accordingly.

**What are reasoning tokens?**

* Internal "thinking" the model does before generating the final response
* Used by models like OpenAI's gpt-5 series for complex reasoning tasks
* Not visible in the final output but improve response quality

**Cost impact:**

* Reasoning tokens are added to your input token count
* More complex questions may generate more reasoning tokens
* Primarily affects the **DEEPTHINK** profile, which uses advanced reasoning models

**Example:**

```
User prompt: "Solve this complex math problem..."
- Visible input tokens: 10
- Reasoning tokens (internal): 500
- Total input tokens billed: 510
- Output tokens: 100

Cost with DEEPTHINK profile:
- Input: 510 × $1.25/M = $0.0006375
- Output: 100 × $12.50/M = $0.00125
- Total: ~$0.00189 (0.00189 credits)
```

💡 **Tip**: Check the `usage` field in API responses to see the exact breakdown of prompt tokens (input + reasoning) and completion tokens (output).
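
To see how reasoning tokens shift the bill, the sketch below reproduces the DEEPTHINK example above. It assumes, as this section describes, that reasoning tokens are billed at the input rate; `costWithReasoning` is a hypothetical helper, and the rates are the DEEPTHINK prices from the pricing table.

```javascript
// DEEPTHINK per-million-token prices in USD, from the Profile Pricing table.
const DEEPTHINK = { input: 1.25, output: 12.5 };

// Hypothetical helper: reasoning tokens are added to the billed input count,
// as described in this section.
function costWithReasoning({ visibleInput, reasoning, output }, price = DEEPTHINK) {
  const billedInput = visibleInput + reasoning;
  const inputCost = (billedInput * price.input) / 1_000_000;
  const outputCost = (output * price.output) / 1_000_000;
  return { billedInput, inputCost, outputCost, total: inputCost + outputCost };
}

const result = costWithReasoning({ visibleInput: 10, reasoning: 500, output: 100 });
console.log(result.billedInput); // 510
console.log(result.total.toFixed(7)); // 0.0018875
```

In this example the reasoning tokens account for nearly all of the billed input, so the visible prompt length alone is a poor predictor of input cost on reasoning models.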

### Examples

#### Example 1: Simple Question (Lite Profile)

```
Request: "What is 2+2?"
- Input: ~10 tokens
- Output: ~5 tokens

Cost calculation:
- Input: 10 tokens × $0.05 / 1M = $0.0000005
- Output: 5 tokens × $0.50 / 1M = $0.0000025
Total: ~$0.000003 (0.000003 credits)
```

#### Example 2: Content Generation (Standard Profile)

```
Request: "Write a 500-word blog post about coffee"
- Input: ~20 tokens
- Output: ~750 tokens

Cost calculation:
- Input: 20 tokens × $0.25 / 1M = $0.000005
- Output: 750 tokens × $2.50 / 1M = $0.001875
Total: ~$0.00188 (0.00188 credits)
```

#### Example 3: Document Analysis (DeepThink Profile)

```
Request: Analyze a 10-page legal document
- Input: ~5,000 tokens
- Output: ~1,000 tokens

Cost calculation:
- Input: 5,000 tokens × $1.25 / 1M = $0.00625
- Output: 1,000 tokens × $12.50 / 1M = $0.0125
Total: ~$0.01875 (0.01875 credits)
```
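
In the pricing table, each step up in profile multiplies both rates by five, so it can help to price the same request against every profile before choosing one. A minimal sketch, using the table rates and a hypothetical `compareProfiles` helper (it ignores any reasoning tokens a profile may add):

```javascript
// Per-million-token prices in USD, copied from the Profile Pricing table.
const PROFILE_PRICES = {
  lite: { input: 0.05, output: 0.5 },
  standard: { input: 0.25, output: 2.5 },
  deepthink: { input: 1.25, output: 12.5 },
};

// Hypothetical helper: prices one request shape against every paid profile.
function compareProfiles(inputTokens, outputTokens) {
  return Object.entries(PROFILE_PRICES).map(([profile, price]) => ({
    profile,
    usd: (inputTokens * price.input + outputTokens * price.output) / 1_000_000,
  }));
}

// Example 3's document analysis: ~5,000 input and ~1,000 output tokens.
for (const { profile, usd } of compareProfiles(5000, 1000)) {
  console.log(`${profile}: $${usd.toFixed(5)}`);
}
```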

## Token-to-Credit Conversion

**1 credit = $1 USD**

This makes cost calculation straightforward:

* $10 = 10 credits
* $100 = 100 credits
* $1,000 = 1,000 credits
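
Because credits map one-to-one to dollars, a balance can be turned into a rough request budget. The sketch below is only an estimate under the assumption that every request costs the same; `requestsCovered` is a hypothetical helper.

```javascript
// Hypothetical helper: how many requests of a given cost a balance covers.
// Credits and USD are interchangeable here because 1 credit = $1.
function requestsCovered(balanceCredits, costPerRequestUsd) {
  if (costPerRequestUsd <= 0) return Infinity; // e.g. the free DEV profile
  return Math.floor(balanceCredits / costPerRequestUsd);
}

// A 10-credit top-up against Example 2's ~$0.00188 request.
console.log(requestsCovered(10, 0.00188)); // 5319
```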

## Cost Optimization Strategies

### 1. Choose the Right Profile

Don't pay for more than you need:

```javascript
// ❌ Expensive for simple tasks
await client.chat.completions.create({
  model: "deepthink",
  messages: [{ role: "user", content: "Hello!" }],
});

// ✅ Cost-effective
await client.chat.completions.create({
  model: "lite",
  messages: [{ role: "user", content: "Hello!" }],
});
```

### 2. Reduce Input Tokens

Shorter prompts = lower costs:

```javascript
// ❌ Verbose
const verbosePrompt = "I would like you to please help me by generating a summary of the following text that I am about to provide to you...";

// ✅ Concise
const concisePrompt = "Summarize this text:";
```

### 3. Limit Output Tokens

Control response length:

```javascript
await client.chat.completions.create({
  model: "standard",
  messages: [...],
  max_tokens: 100  // Limit response to 100 tokens
});
```
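
A `max_tokens` cap also puts an upper bound on the output side of the bill. A minimal sketch, with a hypothetical `maxOutputCost` helper and the STANDARD output rate from the pricing table; input tokens are still billed separately:

```javascript
// Hypothetical helper: upper bound on output cost for a given max_tokens cap.
function maxOutputCost(maxTokens, outputPricePerMillion) {
  return (maxTokens * outputPricePerMillion) / 1_000_000;
}

// STANDARD output rate is $2.50 per million tokens (from the pricing table).
console.log(maxOutputCost(100, 2.5).toFixed(5)); // 0.00025
```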

### 4. Cache Common Responses

Store frequently requested responses:

```javascript
const cache = new Map();

async function getCachedCompletion(prompt) {
  if (cache.has(prompt)) {
    return cache.get(prompt); // No API call = no cost
  }

  const response = await client.chat.completions.create({
    model: "standard",
    messages: [{ role: "user", content: prompt }],
  });

  cache.set(prompt, response);
  return response;
}
```

### 5. Use DEV Profile for Testing

Don't consume credits during development:

```javascript
// Use the free DEV profile everywhere except production
const model = process.env.NODE_ENV === 'production' ? 'standard' : 'dev';

await client.chat.completions.create({
  model: model,
  messages: [...]
});
```

### 6. Batch Similar Requests

Process multiple items in one request:

```javascript
// ❌ Multiple expensive requests
for (const item of items) {
  await client.chat.completions.create({
    model: "standard",
    messages: [{ role: "user", content: `Process: ${item}` }],
  });
}

// ✅ Single batched request
await client.chat.completions.create({
  model: "standard",
  messages: [
    {
      role: "user",
      content: `Process these items: ${items.join(", ")}`,
    },
  ],
});
```
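
A single very large batch can run into a model's context or output limits, so a middle ground is to send items in fixed-size groups. The `chunk` helper below is a hypothetical sketch; a suitable batch size depends on your items and model and is not specified here.

```javascript
// Hypothetical helper: split items into batches of at most `size`.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Each batch would become one request, for example with
// content: `Process these items: ${batch.join(", ")}`
const batches = chunk(["a", "b", "c", "d", "e"], 2);
console.log(batches.length); // 3
```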

## Monitoring Usage

### In Your Dashboard

* **Current Balance**: See credits remaining
* **Usage History**: Track daily/weekly/monthly consumption
* **Alerts**: Get notified when balance is low

### In API Responses

Every response includes usage information:

```json
{
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 100,
    "total_tokens": 150
  },
  "profile": "standard",
  "model": "gpt-5-mini"
}
```

### Programmatic Monitoring

```javascript
const completion = await client.chat.completions.create({
  model: "standard",
  messages: [...]
});

// STANDARD rates: $0.25 per million input tokens, $2.50 per million output tokens
const cost = (
  completion.usage.prompt_tokens * 0.25 / 1_000_000 +
  completion.usage.completion_tokens * 2.50 / 1_000_000
);

console.log(`Request cost: $${cost.toFixed(6)}`);
console.log(`Tokens used: ${completion.usage.total_tokens}`);
```
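
If you call more than one profile, a running total per profile can be more useful than a per-request figure. The sketch below assumes each response carries the `profile` and `usage` fields shown in the JSON example above; `RATES` and `createUsageTracker` are illustrative names, and the rates are a snapshot of the pricing table.

```javascript
// Per-million-token prices in USD, copied from the Profile Pricing table.
const RATES = {
  lite: { input: 0.05, output: 0.5 },
  standard: { input: 0.25, output: 2.5 },
  deepthink: { input: 1.25, output: 12.5 },
  dev: { input: 0, output: 0 },
};

// Hypothetical tracker: accumulates estimated spend (USD) per profile.
function createUsageTracker() {
  const totals = {};
  return {
    // Records one response and returns its estimated cost.
    record(response) {
      const rate = RATES[response.profile];
      if (!rate) throw new Error(`Unknown profile: ${response.profile}`);
      const { prompt_tokens, completion_tokens } = response.usage;
      const usd =
        (prompt_tokens * rate.input + completion_tokens * rate.output) / 1_000_000;
      totals[response.profile] = (totals[response.profile] ?? 0) + usd;
      return usd;
    },
    // Returns a copy so callers cannot mutate the internal totals.
    totals: () => ({ ...totals }),
  };
}

const tracker = createUsageTracker();
tracker.record({ profile: "standard", usage: { prompt_tokens: 50, completion_tokens: 100 } });
console.log(tracker.totals());
```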

## Billing & Invoices

### Prepaid System

* **No subscriptions**: Pay only for what you use
* **No auto-renewal**: Credits don't expire but must be added manually
* **Full control**: You can't accidentally overspend

### Adding Credits

1. Log in to your dashboard at [halfred.ai](https://halfred.ai)
2. Navigate to **Credits**
3. Click **"Top up Credits"**
4. Choose an amount or enter a custom value (minimum $5)
5. Complete the secure payment via Stripe

## Credit Balance & Limits

### What Happens When Credits Run Out?

When your balance reaches zero:

* API requests will return a `429 Insufficient Credits` error
* No requests will be processed until credits are added
* No charges or overdrafts - you stay in control
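
In application code, it may be worth treating this error separately from other failures, since retrying will not help until credits are added. The sketch below assumes the client throws errors that carry a numeric HTTP `status` property; that property and both helper names are assumptions, not documented Halfred behavior. Note that 429 is also the standard HTTP status for rate limiting, so you may need to inspect the error message to tell the two cases apart.

```javascript
// Assumption: errors thrown by the client carry a numeric HTTP `status`.
// The 429 code comes from this page; the helper name is hypothetical.
function isInsufficientCredits(error) {
  return Boolean(error) && error.status === 429;
}

// Hypothetical wrapper: runs a request function and converts an
// out-of-credits failure into a clearer, actionable error.
async function withCreditCheck(requestFn) {
  try {
    return await requestFn();
  } catch (error) {
    if (isInsufficientCredits(error)) {
      throw new Error("Out of credits: top up in the dashboard, then retry.", {
        cause: error,
      });
    }
    throw error; // Unrelated failures pass through unchanged.
  }
}

// Usage (inside an async function):
// const completion = await withCreditCheck(() =>
//   client.chat.completions.create({ model: "standard", messages })
// );
```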

## Frequently Asked Questions

### Do credits expire?

No, credits never expire. Use them at your own pace.

### Can I get a refund?

Yes, unused credits can be refunded within 30 days of purchase.

### What happens if I run out mid-request?

Requests are checked before processing. You won't be charged if you have insufficient credits.

### Can I set spending limits?

Yes. Your prepaid balance is your spending limit: once it reaches zero, requests are rejected until you add credits, and we never charge beyond your prepaid amount.

### Are there hidden fees?

No. The only cost is the credits you purchase and consume.

### How does pricing compare to using providers directly?

Halfred adds a small markup for routing intelligence, unified billing, and multi-provider access. For most users, the convenience and time savings outweigh the minimal additional cost.

### Can I share credits across multiple projects?

Credits are tied to your account, not to an individual project. Currently, each Halfred account has one project, one API key, and one shared credit pool.

### Do you offer free trials?

Yes! Use the **DEV** profile for free testing. No credit card required for signup.

## Support

Questions about pricing or billing?

* **Email**: [billing@halfred.ai](mailto:contact@halfred.ai)
* **Dashboard**: [halfred.ai/billing](https://halfred.ai)

## Next Steps

* Learn about [Model Profiles](/03-concepts/01-profiles.md)
* Understand [Tokens](/03-concepts/03-tokens.md)
* Optimize with [Best Practices](/05-advanced/01-best-practices.md)
* Start building with our [Quick Start Guide](/01-getting-started/01-quickstart.md)

