# Best Practices

## Profile Selection

### Choose the Right Profile for Each Task

```javascript
// ✅ Good: Match profile to task complexity
async function handleUserRequest(task) {
  // Simple UI text
  if (task.type === 'autocomplete') {
    return await client.chat.completions.create({
      model: "lite",
      messages: [...]
    });
  }

  // General chat
  if (task.type === 'conversation') {
    return await client.chat.completions.create({
      model: "standard",
      messages: [...]
    });
  }

  // Complex analysis
  if (task.type === 'analysis') {
    return await client.chat.completions.create({
      model: "deepthink",
      messages: [...]
    });
  }
}
```

### Start Simple, Upgrade When Needed

```javascript
// Try lite first for cost efficiency
let completion;
try {
  completion = await client.chat.completions.create({
    model: "lite",
    messages: [...]
  });

  // Check quality (implement your own logic)
  if (!isQualitySufficient(completion)) {
    // Upgrade to standard if needed
    completion = await client.chat.completions.create({
      model: "standard",
      messages: [...]
    });
  }
} catch (error) {
  console.error(error);
}
```
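The `isQualitySufficient` check above is left to you. A minimal sketch, assuming a simple length-and-keyword heuristic is enough for your use case (the `minLength` and `mustInclude` options are hypothetical and not part of the Halfred API):

```javascript
// Hypothetical quality gate: tune the threshold and required terms per task.
function isQualitySufficient(completion, { minLength = 20, mustInclude = [] } = {}) {
  const content = completion?.choices?.[0]?.message?.content;

  // No usable text at all
  if (typeof content !== "string") {
    return false;
  }

  // Very short answers are treated as low quality
  const text = content.trim();
  if (text.length < minLength) {
    return false;
  }

  // Every required term must appear (case-insensitive)
  const lower = text.toLowerCase();
  return mustInclude.every((term) => lower.includes(term.toLowerCase()));
}
```

A heuristic like this only catches obvious failures. For tasks where correctness matters, a task-specific check (for example, parsing expected JSON) is likely a better signal.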

## Cost Optimization

### 1. Reduce Token Usage

```javascript
// ❌ Verbose prompts
const verbosePrompt = "I would like you to please provide me with a detailed and comprehensive explanation of...";

// ✅ Concise prompts
const concisePrompt = "Explain how...";
```

### 2. Limit Output Length

```javascript
await client.chat.completions.create({
  model: "lite",
  messages: [{ role: "user", content: "Summarize this article" }],
  max_tokens: 150, // Prevent overly long responses
});
```

**Tip:** Setting `max_tokens` too low can cut responses off mid-sentence and leave them incomplete. This matters most for the **STANDARD** and **DEEPTHINK** profiles, which tend to generate longer, more detailed responses. If you notice truncated outputs, increase the limit or remove it altogether for these profiles.
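One way to act on truncated outputs is to retry with a larger limit. The sketch below assumes the response follows the OpenAI-style convention of setting `choices[0].finish_reason` to `"length"` when `max_tokens` cuts the output. This page does not confirm that field, so verify it against the API reference first. `createWithHeadroom` and its options are hypothetical names:

```javascript
// Assumption: finish_reason === "length" signals that max_tokens cut the output.
async function createWithHeadroom(client, params, { maxAttempts = 3, factor = 2 } = {}) {
  let maxTokens = params.max_tokens;
  let completion;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    completion = await client.chat.completions.create({
      ...params,
      max_tokens: maxTokens,
    });

    const truncated = completion.choices[0].finish_reason === "length";

    // Done if the response is complete, or if there was no limit to raise
    if (!truncated || maxTokens === undefined) {
      return completion;
    }

    // Raise the limit before the next attempt
    maxTokens = Math.ceil(maxTokens * factor);
  }

  // Still truncated after maxAttempts; the caller decides what to do
  return completion;
}
```

Each retry is billed as a separate request, so keep `maxAttempts` small.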

### 3. Cache Common Responses

```javascript
const cache = new Map();

async function getCachedCompletion(prompt, model = "standard") {
  const key = `${model}:${prompt}`;

  if (cache.has(key)) {
    return cache.get(key);
  }

  const completion = await client.chat.completions.create({
    model,
    messages: [{ role: "user", content: prompt }],
  });

  cache.set(key, completion);
  return completion;
}
```
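The `Map` above grows without bound and never expires entries. A possible refinement, sketched here with hypothetical names (`TtlCache`, `ttlMs`, `maxEntries`), adds a time-to-live and a size cap:

```javascript
// Hypothetical in-memory cache with expiry and a maximum size.
class TtlCache {
  constructor({ ttlMs = 5 * 60 * 1000, maxEntries = 500, now = Date.now } = {}) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now; // Injectable clock, useful in tests
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;

    // Drop entries that have outlived their TTL
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    // Map preserves insertion order, so the first key is the oldest entry
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      this.entries.delete(this.entries.keys().next().value);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

To use it in `getCachedCompletion`, replace the `Map` with `new TtlCache()` and treat an `undefined` result from `get` as a cache miss.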

## Security Best Practices

### 1. Never Expose API Keys

```javascript
// ❌ Bad: Hardcoded key
// const client = new Halfred({
//   apiKey: "halfred_xxxxxxxxxxxxxxxxxxxxxxxxxxxx",
// });

// ✅ Good: Environment variable
const client = new Halfred({
  apiKey: process.env.HALFRED_API_KEY,
});
```

### 2. Use Server-Side Only

```javascript
// ✅ Backend API route
app.post("/api/chat", async (req, res) => {
  const completion = await client.chat.completions.create({
    model: "standard",
    messages: req.body.messages,
  });

  res.json(completion);
});

// ❌ Never in frontend JavaScript
// const client = new Halfred({ apiKey: "..." }); // WRONG!
```

### 3. Validate User Input

```javascript
function validateMessages(messages) {
  if (!Array.isArray(messages) || messages.length === 0) {
    throw new Error("Invalid messages");
  }

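  // Limit message length (assumes text-only messages)
  for (const msg of messages) {
    // Reject missing or non-string content so `.length` is a character count
    if (typeof msg?.content !== "string") {
      throw new Error("Invalid message content");
    }
    if (msg.content.length > 10000) {
      throw new Error("Message too long");
    }
  }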

  return messages;
}

const validated = validateMessages(userInput);
const completion = await client.chat.completions.create({
  model: "standard",
  messages: validated,
});
```

## Conversation Management

### 1. Manage Context Window

```javascript
function trimConversation(messages, maxMessages = 20) {
  if (messages.length <= maxMessages) {
    return messages;
  }

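  // Keep the system message plus the most recent non-system messages.
  // Excluding the system message from the slice avoids duplicating it.
  const systemMsg = messages.find((m) => m.role === "system");
  const recent = messages.filter((m) => m !== systemMsg).slice(-maxMessages);

  return systemMsg ? [systemMsg, ...recent] : recent;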
}
```

### 2. Summarize Long Histories

```javascript
async function summarizeHistory(oldMessages) {
  const summary = await client.chat.completions.create({
    model: "lite",
    messages: [
      {
        role: "user",
        content: `Summarize this conversation: ${JSON.stringify(oldMessages)}`,
      },
    ],
    max_tokens: 200,
  });

  return {
    role: "system",
    content: `Previous conversation: ${summary.choices[0].message.content}`,
  };
}
```
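Trimming and summarizing can be combined: keep the system messages and the most recent turns, and replace everything older with a summary. A minimal sketch, where `compactConversation` and `keepRecent` are hypothetical names and `summarize` is any async function with the same contract as `summarizeHistory` above:

```javascript
// Hypothetical helper: summarizes older turns and keeps the latest ones verbatim.
async function compactConversation(messages, summarize, { keepRecent = 10 } = {}) {
  const systemMsgs = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");

  // Nothing to compact yet
  if (rest.length <= keepRecent) {
    return messages;
  }

  // Split into older turns (to summarize) and recent turns (to keep)
  const cut = rest.length - keepRecent;
  const old = rest.slice(0, cut);
  const recent = rest.slice(cut);

  const summaryMsg = await summarize(old);

  return [...systemMsgs, summaryMsg, ...recent];
}
```

Called as `compactConversation(history, summarizeHistory)`, this costs one extra request per compaction, so it is probably worth running only when the history exceeds a threshold rather than on every turn.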

## Monitoring & Logging

The Halfred dashboard provides real-time visibility into your API usage during development. Recent logs help you debug errors and understand request patterns, and cost monitoring tracks your spending as you develop and test your application. Use both to identify issues early and optimize your usage before moving to production.

### 1. Track Token Usage

```javascript
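let totalTokens = 0;
let totalCost = 0; // Update from your own pricing data; not computed here

async function trackCompletion(params) {
  const completion = await client.chat.completions.create(params);

  // Accumulate a running total so usage can be reported later
  totalTokens += completion.usage.total_tokens;
  console.log(`Tokens: ${completion.usage.total_tokens} (total: ${totalTokens})`);

  return completion;
}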
```

### 2. Log Errors

```javascript
try {
  const completion = await client.chat.completions.create({...});
} catch (error) {
  console.error({
    timestamp: new Date().toISOString(),
    status: error.status,
    code: error.code,
    message: error.message,
    model: "standard"
  });

  // Send to monitoring service
  // Sentry.captureException(error);
}
```

## Testing

### 1. Use DEV Profile for Development

```javascript
const model = process.env.NODE_ENV === 'production'
  ? 'standard'
  : 'dev';

await client.chat.completions.create({
  model,
  messages: [...]
});
```

### 2. Mock API Calls in Tests

```javascript
// Jest example (assumes `Halfred` is imported from "halfred.ai" in this test file)
jest.mock("halfred.ai");

test("handles completion correctly", async () => {
  const mockCreate = jest.fn().mockResolvedValue({
    choices: [{ message: { content: "Test response" } }],
  });

  Halfred.mockImplementation(() => ({
    chat: {
      completions: { create: mockCreate },
    },
  }));

  // Test your code
});
```

## Production Checklist

* [ ] API keys stored securely in environment variables
* [ ] Monitoring and alerting configured
* [ ] Appropriate profiles chosen for each use case
* [ ] Input validation implemented
* [ ] API keys never exposed in client-side code
* [ ] DEV profile used for testing only
* [ ] Conversation history managed efficiently

## Related Documentation

* [Model Profiles](/03-concepts/01-profiles.md)
* [Pricing & Credits](/03-concepts/02-pricing.md)
* [Error Handling](/04-api-reference/02-errors.md)
* [Authentication](/01-getting-started/02-authentication.md)

## Support

Need optimization help?

* **Email**: <support@halfred.ai>
* **Discord**: [Join our community](https://discord.gg/wS2awX4EV7)


