Best Practices
Optimize your Halfred integration for performance, cost-efficiency, and reliability with proven strategies and patterns.
Profile Selection
Choose the Right Profile for Each Task
// ✅ Good: Match profile to task complexity
async function handleUserRequest(task) {
  // Simple UI text
  if (task.type === 'autocomplete') {
    return await client.chat.completions.create({
      model: "lite",
      messages: [...]
    });
  }

  // General chat
  if (task.type === 'conversation') {
    return await client.chat.completions.create({
      model: "standard",
      messages: [...]
    });
  }

  // Complex analysis
  if (task.type === 'analysis') {
    return await client.chat.completions.create({
      model: "deepthink",
      messages: [...]
    });
  }
}

Start Simple, Upgrade When Needed
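One way to read this heading is to try the cheapest profile first and escalate only when the answer fails a check you define. The sketch below assumes an OpenAI-style response shape; `completeWithEscalation` and `isGoodEnough` are hypothetical names, not part of the Halfred API.

```javascript
// Profiles ordered from cheapest to most capable (ids taken from the example above).
const PROFILE_LADDER = ["lite", "standard", "deepthink"];

// Hypothetical helper: try each profile in order and stop at the first
// response that the caller-supplied isGoodEnough check accepts.
// If no response passes, the result from the last profile is returned.
async function completeWithEscalation(client, messages, isGoodEnough) {
  let last = null;
  for (const model of PROFILE_LADDER) {
    const response = await client.chat.completions.create({ model, messages });
    last = { model, response };
    if (isGoodEnough(response)) break;
  }
  return last;
}
```

The quality check is left to the caller because "good enough" depends on the task, for example a minimum length or a successful JSON parse.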
Cost Optimization
1. Reduce Token Usage
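A simple place to start is the prompt itself: redundant whitespace and oversized user input both add tokens without adding meaning. The helper below is a minimal sketch; `compactPrompt` and its default character cap are assumptions chosen for illustration.

```javascript
// Hypothetical helper: collapse runs of whitespace and cap the length of
// user-supplied text before it is sent as a prompt.
function compactPrompt(text, maxChars = 2000) {
  const collapsed = text.replace(/\s+/g, " ").trim();
  return collapsed.length > maxChars ? collapsed.slice(0, maxChars) : collapsed;
}
```

The cap is measured in characters, not tokens, so treat it as a rough guard and not an exact budget.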
2. Limit Output Length
Tip: Setting max_tokens too low can cut responses off mid-sentence. This matters most for the STANDARD and DEEPTHINK profiles, which tend to generate longer, more detailed responses. If you notice truncated outputs, raise the limit or remove it altogether for these profiles.
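The sketch below pairs a per-profile cap with a truncation check. The cap values are placeholders to tune for your workload, and the `finish_reason === "length"` test assumes the response follows the OpenAI-style chat completion shape.

```javascript
// Assumed per-profile output caps; tune these for your own workload.
const MAX_TOKENS_BY_PROFILE = { lite: 150, standard: 800, deepthink: 2000 };

// Build request options with a cap that matches the profile.
function buildRequest(model, messages) {
  return { model, messages, max_tokens: MAX_TOKENS_BY_PROFILE[model] };
}

// True when the response stopped because it hit the token cap
// (assumes an OpenAI-style finish_reason field).
function wasTruncated(response) {
  return response.choices[0].finish_reason === "length";
}
```

If `wasTruncated` returns true often for a profile, that is a signal to raise or drop its cap.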
3. Cache Common Responses
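For prompts that repeat often, a small in-memory cache avoids paying for the same completion twice. This is a minimal sketch with a time-to-live; `ResponseCache` is a hypothetical class, and a shared store would be needed if you run more than one server process.

```javascript
// Minimal in-memory cache keyed by model + messages, with a time-to-live.
class ResponseCache {
  constructor(ttlMs = 5 * 60 * 1000) {
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }

  // Identical model and messages produce the same key.
  key(model, messages) {
    return JSON.stringify([model, messages]);
  }

  // Returns the cached value, or undefined if missing or expired.
  get(model, messages, now = Date.now()) {
    const entry = this.entries.get(this.key(model, messages));
    if (!entry || now - entry.storedAt > this.ttlMs) return undefined;
    return entry.value;
  }

  set(model, messages, value, now = Date.now()) {
    this.entries.set(this.key(model, messages), { value, storedAt: now });
  }
}
```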
Security Best Practices
1. Never Expose API Keys
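A common way to keep keys out of source control is to read them from the environment at startup and fail fast if they are missing. The variable name `HALFRED_API_KEY` and the helper `loadApiKey` are assumptions for illustration.

```javascript
// Read the key from the environment instead of hard-coding it in source.
// HALFRED_API_KEY is an assumed variable name.
function loadApiKey(env = process.env) {
  const key = env.HALFRED_API_KEY;
  if (!key) {
    throw new Error("HALFRED_API_KEY is not set");
  }
  return key;
}
```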
2. Use Server-Side Only
3. Validate User Input
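A minimal sketch of input validation before anything is sent to the API. The length limit and the `validateUserInput` name are assumptions; real applications may also need content checks specific to their domain.

```javascript
// Assumed upper bound on user input length, in characters.
const MAX_INPUT_CHARS = 4000;

// Hypothetical validator: accept only non-empty strings under the length cap.
function validateUserInput(input) {
  if (typeof input !== "string") {
    return { ok: false, error: "Input must be a string" };
  }
  const trimmed = input.trim();
  if (trimmed.length === 0) {
    return { ok: false, error: "Input must not be empty" };
  }
  if (trimmed.length > MAX_INPUT_CHARS) {
    return { ok: false, error: "Input is too long" };
  }
  return { ok: true, value: trimmed };
}
```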
Conversation Management
1. Manage Context Window
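One simple approach is a sliding window: keep the system messages and only the most recent turns. `trimHistory` is a hypothetical helper, and the default window size is an arbitrary starting point.

```javascript
// Keep every system message plus the most recent maxTurns other messages.
function trimHistory(messages, maxTurns = 10) {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  const recent = maxTurns > 0 ? rest.slice(-maxTurns) : [];
  return [...system, ...recent];
}
```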
2. Summarize Long Histories
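When older turns still matter, they can be condensed into a single summary message instead of being dropped. The sketch below is one possible shape: `summarizeHistory` is a hypothetical helper, the summary prompt is illustrative, and the response access assumes an OpenAI-style `choices[0].message.content`.

```javascript
// Hypothetical helper: once the history exceeds keepRecent messages,
// replace the older part with a single summary message.
async function summarizeHistory(client, messages, keepRecent = 6) {
  if (messages.length <= keepRecent) return messages;
  const older = messages.slice(0, -keepRecent);
  const recent = messages.slice(-keepRecent);
  const transcript = older.map((m) => `${m.role}: ${m.content}`).join("\n");
  // Use the cheapest profile for the summary itself.
  const response = await client.chat.completions.create({
    model: "lite",
    messages: [
      { role: "system", content: "Summarize this conversation in a few sentences." },
      { role: "user", content: transcript },
    ],
  });
  const summary = response.choices[0].message.content;
  return [
    { role: "system", content: `Summary of earlier conversation: ${summary}` },
    ...recent,
  ];
}
```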
Monitoring & Logging
The Halfred dashboard provides real-time visibility into your API usage during development. Recent logs help you debug errors and understand request patterns, and cost monitoring shows what you are spending as you build and test your application. Use both to catch issues early and tune your usage before moving to production.
1. Track Token Usage
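Alongside the dashboard, usage can be tallied in application code. This sketch assumes each response carries an OpenAI-style `usage` object with `prompt_tokens` and `completion_tokens`; `recordUsage` and `usageTotals` are hypothetical names.

```javascript
// Running totals per profile.
const usageTotals = {};

// Add one response's token counts to the totals for its profile.
// Missing usage fields are treated as zero.
function recordUsage(model, response) {
  const usage = response.usage || {};
  if (!usageTotals[model]) {
    usageTotals[model] = { prompt: 0, completion: 0, requests: 0 };
  }
  const totals = usageTotals[model];
  totals.prompt += usage.prompt_tokens || 0;
  totals.completion += usage.completion_tokens || 0;
  totals.requests += 1;
  return totals;
}
```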
2. Log Errors
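A thin wrapper can log failed calls with enough context to debug them later, then rethrow so callers still see the failure. `createWithLogging` is a hypothetical helper, and the `status` field is an assumption that holds only for errors that carry one.

```javascript
// Hypothetical wrapper: log failed calls with context, then rethrow.
async function createWithLogging(client, params, log = console.error) {
  const startedAt = Date.now();
  try {
    return await client.chat.completions.create(params);
  } catch (error) {
    log({
      message: error.message,
      status: error.status, // may be undefined if the error carries no status
      model: params.model,
      durationMs: Date.now() - startedAt,
    });
    throw error;
  }
}
```

Note that the log entry records the model and timing but not the message contents, which may contain user data.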
Testing
1. Use DEV Profile for Development
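One way to apply this is to pick the profile from the environment so development traffic never reaches a production profile by accident. The `"dev"` model id is an assumption that mirrors the lowercase ids used for the other profiles; `selectModel` is a hypothetical helper.

```javascript
// Use the requested profile in production and the DEV profile everywhere else.
// The "dev" id is assumed, not confirmed by this page.
function selectModel(productionModel, env = process.env) {
  return env.NODE_ENV === "production" ? productionModel : "dev";
}
```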
2. Mock API Calls in Tests
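Unit tests can avoid network calls and cost by passing in a fake client with the same call shape. The sketch below is hand-rolled and framework-agnostic; `createMockClient` is a hypothetical helper, and the response shape assumes an OpenAI-style completion.

```javascript
// Hand-rolled fake with the same call shape as the real client.
// Records every request so tests can assert on what was sent.
function createMockClient(replyText) {
  const calls = [];
  return {
    calls,
    chat: {
      completions: {
        create: async (params) => {
          calls.push(params);
          return { choices: [{ message: { role: "assistant", content: replyText } }] };
        },
      },
    },
  };
}
```

This works best when application code receives the client as a parameter instead of importing a shared instance.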
Production Checklist
Related Documentation
Support
Need optimization help?
Email: [email protected]
Discord: Join our community