Halfred API Reference
Complete REST API reference for Halfred including chat completions, models endpoint, request parameters, and response formats.
Base URL
https://api.halfred.ai/v1
Authentication
All API requests require authentication using an API key. Include your API key in the Authorization header:
Authorization: Bearer halfred_xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Getting an API Key
First Time:
Sign up for a Halfred account at halfred.ai using Google, GitHub, or email
Upon first login, your API key is automatically generated and displayed
Copy and store it securely - you will need it for all API requests
Regenerating Your Key: If you need to regenerate your API key:
Go to your Dashboard
Navigate to the "Your Key" section
Click the three dots menu (⋮) and select "Revoke Key & Get New One"
Copy your new API key immediately
⚠️ Important: Revoking a key is immediate. Update all applications using the old key to avoid service interruption.
API Key Security
Never expose keys: Keep your API keys secure and never expose them in client-side code or public repositories
Server-side only: API keys should only be used in server-side applications
Monitor usage: Track your API key usage and request logs in the dashboard
Revoke if compromised: Immediately regenerate your key if you suspect it has been compromised
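Following the guidance above, the key belongs on the server and outside source control. A minimal Python sketch, assuming the key is stored in an environment variable (the variable name HALFRED_API_KEY is our convention, not mandated by the API):

```python
import os

def auth_headers() -> dict:
    """Build request headers, reading the API key from the environment
    so it never appears in source code or a repository."""
    api_key = os.environ["HALFRED_API_KEY"]  # hypothetical variable name
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Loading the key at request time (rather than hard-coding it) also makes key rotation a configuration change instead of a code change.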
Models
Halfred offers different model profiles optimized for various use cases:
Available Profiles
LITE
Lightweight and fast AI for everyday simple tasks
STANDARD
Balanced performance - let Halfred select the best value model
DEEPTHINK
Advanced reasoning with extended context
DEV
Free calls for development and testing
📖 Learn more: Model Profiles Guide - Detailed information on pricing, context sizes, use cases, and how to choose the right profile.
Endpoints
POST /chat/completions
Generate AI completions for chat conversations.
Request
POST https://api.halfred.ai/v1/chat/completions
Authorization: Bearer halfred_xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Content-Type: application/json
Request Body
Conversation Messages
{
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "What is the capital of France?"
}
],
"model": "standard",
"temperature": 0.7,
"response_format": {
"type": "json_object"
},
"max_tokens": 150
}
Request Parameters
messages (array, required): Array of conversation messages
messages[].role (string, required): Message role: "system", "user", or "assistant"
messages[].content (string, required): Message content
model (string, optional): Model profile to use (defaults to project default or "lite")
temperature (number, optional): Sampling temperature between 0 and 2 (default varies by model)
response_format (object, optional): Response format specification
response_format.type (string, optional): Format type (e.g., "json_object")
stream (boolean, optional): If true, sends partial message deltas (not yet supported)
max_completion_tokens (integer, optional): Upper bound for completion tokens (including reasoning tokens)
max_tokens (integer, optional): Maximum number of tokens to generate in the chat completion
Message Roles
system: Sets the behavior and context for the assistant
user: Messages from the user/human
assistant: Previous responses from the AI (for conversation history)
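A request body combining the roles above can be assembled as follows. This is an illustrative Python sketch; the helper name and default values are ours, not part of the API:

```python
import json

def build_chat_body(user_prompt: str,
                    system_prompt: str = "You are a helpful assistant.",
                    model: str = "standard",
                    temperature: float = 0.7) -> str:
    """Assemble a /chat/completions request body: a system message to set
    behavior, then the user message, plus optional parameters."""
    return json.dumps({
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "model": model,
        "temperature": temperature,
    })
```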
Response
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1677652288,
"provider": "openai",
"model": "gpt-4o",
"profile": "standard",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The capital of France is Paris."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 12,
"completion_tokens": 8,
"total_tokens": 20
}
}
Response Fields
id (string): Unique identifier for the completion
object (string): Always "chat.completion"
created (number): Unix timestamp of when the completion was created
provider (string): The underlying AI provider used
model (string): The specific model that generated the response
profile (string): The Halfred profile used
choices (array): Array of completion choices (typically one)
choices[].index (number): Choice index (usually 0)
choices[].message (object): The generated message
choices[].message.role (string): Always "assistant"
choices[].message.content (string): The generated response content
choices[].finish_reason (string): Reason the completion finished: "stop", "length", "content_filter", or "tool_calls"
usage (object): Token usage information
usage.prompt_tokens (number): Number of tokens in the prompt
usage.completion_tokens (number): Number of tokens in the completion
usage.total_tokens (number): Total tokens used (prompt + completion)
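Reading a response typically means checking finish_reason and pulling out the content and token usage. A small sketch over the field layout documented above (the helper is ours, for illustration only):

```python
def extract_reply(completion: dict) -> tuple:
    """Return (content, truncated, total_tokens) from a chat.completion
    response; truncated is True when finish_reason is "length"."""
    choice = completion["choices"][0]
    truncated = choice["finish_reason"] == "length"
    return (choice["message"]["content"],
            truncated,
            completion["usage"]["total_tokens"])
```

A truncated reply usually means max_tokens was too low for the request.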
GET /models
Retrieve a list of available models and their details.
Request
GET https://api.halfred.ai/v1/models
Authorization: Bearer halfred_xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Response
{
"object": "list",
"data": [
{
"id": "halfred-standard",
"object": "model",
"created": "2024-01-15T10:30:00.000Z",
"owned_by": "halfred"
},
{
"id": "standard",
"object": "model",
"created": "2024-01-15T10:30:00.000Z",
"owned_by": "halfred"
},
{
"id": "halfred-deepthink",
"object": "model",
"created": "2024-01-15T10:30:00.000Z",
"owned_by": "halfred"
},
{
"id": "deepthink",
"object": "model",
"created": "2024-01-15T10:30:00.000Z",
"owned_by": "halfred"
},
(...)
]
}
Response Fields
object (string): Always "list"
data (array): Array of available models
data[].id (string): Model identifier (can be used in chat completions)
data[].object (string): Always "model"
data[].created (string): ISO 8601 timestamp when the model was created
data[].owned_by (string): Always "halfred"
Model ID Formats
Models are available in two formats:
halfred-{profile} (e.g., halfred-standard)
{profile} (e.g., standard)
Both formats are equivalent and can be used interchangeably.
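Since the two formats are interchangeable, normalizing to one of them keeps logging and billing code simple. A one-line helper (our own, not part of any SDK) that strips the optional prefix:

```python
def profile_name(model_id: str) -> str:
    """Map either equivalent model ID format ("halfred-standard" or
    "standard") to the bare profile name."""
    prefix = "halfred-"
    return model_id[len(prefix):] if model_id.startswith(prefix) else model_id
```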
Error Handling
The API uses standard HTTP status codes and returns detailed error information in JSON format.
Error Response Format
{
"error": {
"message": "Invalid API key",
"type": "authentication_error",
"param": null,
"code": "invalid_api_key"
}
}
Common Error Codes
400 invalid_request: Bad request - check your parameters
401 authentication_error: Invalid or missing API key
403 permission_error: API key doesn't have required permissions
404 not_found: Resource not found
429 rate_limit_error: Too many requests or insufficient credits
500 server_error: Internal server error
Specific Error Scenarios
Authentication Errors
Missing Authorization header: Include Authorization: Bearer halfred_xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Invalid API key: Check that your API key is correct and not revoked
Expired API key: Generate a new API key if yours has expired
Request Errors
Missing or empty messages: You must provide a non-empty messages array
Invalid model: Use a valid model ID from the /models endpoint
Context limit exceeded: Reduce message length or use a profile with larger context
Streaming not supported: The stream parameter is not yet supported
Usage Errors
Insufficient credits: Add credits to your account
Profile usage limit exceeded: Upgrade your plan or wait for limit reset
Rate limit exceeded: Reduce request frequency
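A sketch of parsing the error envelope above and deciding whether a retry makes sense. Treating 429 and 5xx responses as transient is our convention for illustration, not an API guarantee:

```python
def parse_error(status: int, body: dict) -> tuple:
    """Extract the error message from the JSON error envelope and flag
    whether a retry is sensible: 429 and 5xx are transient, while other
    4xx errors need a fix on the caller's side."""
    err = body.get("error", {})
    retryable = status == 429 or status >= 500
    return err.get("message", "unknown error"), retryable
```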
Code Examples
cURL
# Get available models
curl -X GET "https://api.halfred.ai/v1/models" \
-H "Authorization: Bearer halfred_xxxxxxxxxxxxxxxxxxxxxxxxxxxx"
# Chat completion
curl -X POST "https://api.halfred.ai/v1/chat/completions" \
-H "Authorization: Bearer halfred_xxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{
"role": "user",
"content": "Explain quantum computing in simple terms"
}
],
"model": "standard",
"temperature": 0.7
}'
SDK Implementation
For easier integration, we recommend using our official SDKs instead of making direct HTTP requests:
Node.js / TypeScript SDK - Full implementation examples with TypeScript support
Python SDK - Complete Python implementation with async support
For other languages, see our OpenAI SDK Compatibility Guide.
Best Practices
Model Selection
Use LITE for simple, fast responses and cost-sensitive applications
Use STANDARD for most production applications - it automatically selects the best model
Use DEEPTHINK for complex reasoning, long documents, or strategic tasks
Use DEV only for development and testing (free but not production-ready)
Message Formatting
Always use the messages array for all requests
Include system messages to set context and behavior
Keep conversation history to maintain context across turns
For single prompts, use a messages array with one user message
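For multi-turn conversations, the history grows by one user/assistant pair per exchange. A minimal illustrative helper (names are ours):

```python
def append_turn(history: list, user_msg: str, assistant_msg: str) -> list:
    """Record one completed exchange so the next request carries the
    full conversation context."""
    history.append({"role": "user", "content": user_msg})
    history.append({"role": "assistant", "content": assistant_msg})
    return history
```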
Error Handling
Always check HTTP status codes
Parse error responses to understand issues
Implement retry logic for transient errors (5xx codes)
Handle rate limiting gracefully
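The retry advice above can be sketched as exponential backoff with jitter. The send callable and the transient-status thresholds are assumptions for illustration:

```python
import random
import time

def with_retries(send, max_attempts: int = 4, base_delay: float = 0.5):
    """Call send() until it returns a non-transient status, sleeping with
    exponential backoff plus jitter between attempts. send is expected
    to return an (http_status, body) pair."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429 and status < 500:
            return status, body  # success or a non-retryable client error
        if attempt < max_attempts - 1:
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
    return status, body  # give up after the final attempt
```

Jitter spreads out retries from concurrent clients so they do not all hit the API again at the same instant.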
Performance
Cache responses when appropriate
Use appropriate temperature settings (lower for factual, higher for creative)
Monitor token usage to optimize costs
Consider using streaming for long responses (contact support for streaming access)
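The caching suggestion above can be as simple as memoizing on the request parameters. Illustrative only; a production cache should also consider TTLs and the fact that nonzero temperatures produce varying outputs:

```python
_cache: dict = {}

def cached_completion(send, prompt: str, model: str, temperature: float):
    """Memoize responses for identical requests. Most useful with
    temperature 0, where repeated prompts yield stable answers."""
    key = (model, prompt, temperature)
    if key not in _cache:
        _cache[key] = send(prompt, model, temperature)
    return _cache[key]
```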
Support
For additional help:
Check our documentation
Contact support at [email protected]
Join our community Discord