# LLM Integration
ShipLens uses Anthropic's Claude models for all AI-powered features. The integration is designed for cost efficiency, transparency, and reliability.
## Model Selection
| Model | Used For | Cost (per 1M tokens) |
|---|---|---|
| Claude Haiku | Standard commit analysis, triage context | Input: $0.80 / Output: $4.00 |
| Claude Sonnet | Deep analysis, weekly digests, 1:1 reports | Input: $3.00 / Output: $15.00 |
## Why Two Models?
The dual-model approach balances cost and quality:
- Haiku handles ~70-80% of commits (standard depth) at very low cost. For a typical commit analysis, this costs $0.001–0.003.
- Sonnet is reserved for tasks where quality matters most: security-sensitive code, generated reports, and agentic deep analysis.
This typically results in a 10–20× cost reduction compared to using Sonnet for everything.
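The routing decision can be sketched as follows. This is an illustrative Python sketch of logic the Elixir code presumably implements; the model identifiers, depth names, and `security_sensitive` flag are assumptions, not the actual ShipLens API.

```python
# Illustrative sketch of dual-model routing (the real implementation is Elixir).
# Model names and routing criteria are assumptions based on the text above.
HAIKU = "claude-haiku"
SONNET = "claude-sonnet"

def pick_model(depth: str, security_sensitive: bool = False) -> str:
    """Route standard-depth commits (~70-80% of the total) to Haiku;
    reserve Sonnet for deep analysis, reports, and security-sensitive code."""
    if security_sensitive or depth in ("deep", "digest", "one_on_one"):
        return SONNET
    return HAIKU

print(pick_model("standard"))        # routine commit -> claude-haiku
print(pick_model("deep"))            # agentic deep analysis -> claude-sonnet
print(pick_model("standard", True))  # auth-touching commit -> claude-sonnet
```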
## Integration Architecture
### Client Abstraction
The LLM client (`llm/client.ex`) provides two main functions:
| Function | Description |
|---|---|
| `complete(prompt, model)` | Standard prompt → response |
| `complete_with_tools(prompt, tools, model)` | Prompt with tool definitions for agentic use |
Both return a Response struct containing:
- `content` – The LLM's response text
- `model` – Which model was used
- `input_tokens` / `output_tokens` – Token counts
- `cost_usd` – Calculated cost
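A minimal sketch of the `Response` struct and the cost calculation, in Python for brevity (the real client is Elixir). The pricing constants come from the model table above; the token counts in the demo are illustrative.

```python
from dataclasses import dataclass

# Per-1M-token pricing from the model selection table above.
PRICING = {
    "claude-haiku":  {"input": 0.80, "output": 4.00},
    "claude-sonnet": {"input": 3.00, "output": 15.00},
}

@dataclass
class Response:
    content: str
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD from token counts and per-million-token pricing."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A typical standard analysis: ~2,000 input tokens, ~300 output tokens on Haiku.
resp = Response("...", "claude-haiku", 2000, 300,
                cost_usd("claude-haiku", 2000, 300))
print(round(resp.cost_usd, 6))  # -> 0.0028
```

Note the result falls inside the $0.001–0.003 per-commit range quoted above.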
### Cost Tracking
Every LLM call has its cost tracked and stored:
- Commit reports store `cost_usd` per analysis
- Weekly digests store `cost_usd` per generation
- 1:1 reports store `cost_usd` per report
This enables full cost visibility and budgeting.
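Because every record carries its own `cost_usd`, budgeting reduces to simple aggregation. A sketch (record shapes are illustrative, not the actual ShipLens schema):

```python
# Sketch: with cost_usd stored per record, budgeting is a simple aggregation.
# The record dicts below are illustrative, not the real database schema.
reports = [
    {"kind": "commit", "cost_usd": 0.002},
    {"kind": "commit", "cost_usd": 0.0015},
    {"kind": "digest", "cost_usd": 0.008},
]

total = sum(r["cost_usd"] for r in reports)
by_kind = {}
for r in reports:
    by_kind[r["kind"]] = by_kind.get(r["kind"], 0.0) + r["cost_usd"]

print(round(total, 4))  # -> 0.0115
```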
## Prompt Engineering
All prompts are centralized in `llm/prompts.ex` and share common design principles:
### Structured JSON Output
Every analysis prompt requests a specific JSON schema:
```json
{
  "commit_type": "feat",
  "summary": "...",
  "areas_affected": ["auth", "api"],
  "complexity": 3,
  "impact": 4,
  "quality_signals": ["has_tests", "clean_patterns"],
  "risk_signals": ["touches_auth"],
  "slop_dimensions": {
    "verbosity": 1,
    "unnecessary_comments": 0,
    "over_engineering": 0,
    "defensive_bloat": 0,
    "style_mismatch": 0,
    "redundancy": 0,
    "scope_creep": 0
  },
  "confidence": 0.85
}
```
### Project Context Injection
Standard analysis prompts include project context from the vector store:
```
You are analyzing a commit in the context of this project:
[2000 chars of relevant project context from RAG]
Commit: [sha]
Message: [message]
Diff: [compressed diff]
```
### Agentic Deep Analysis
Deep analysis prompts give the LLM access to codebase tools:
| Tool | Capability |
|---|---|
| `read_file(path)` | Read any file in the repository |
| `search_codebase(query)` | Search for patterns across files |
| `list_directory(path)` | List directory contents |
| `get_file_at_commit(path, sha)` | Read a file at a specific commit |
The agentic loop runs for up to 10 turns with a $0.50 cost cap per commit.
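The turn- and cost-capped loop can be sketched as follows, with the LLM call and tool execution stubbed out (function shapes and return keys are illustrative assumptions, not the real Elixir client):

```python
MAX_TURNS = 10       # turn cap from the text above
COST_CAP_USD = 0.50  # per-commit cost cap

def deep_analysis(call_llm, run_tool):
    """Agentic loop sketch: call_llm and run_tool stand in for the real
    client and the read_file/search_codebase/... tools described above."""
    spent, messages = 0.0, []
    for _ in range(MAX_TURNS):
        reply = call_llm(messages)  # dict with cost and, optionally, a tool call
        spent += reply["cost_usd"]
        if spent > COST_CAP_USD:
            return {"partial": True, "cost_usd": spent}  # cap hit: partial results
        if reply.get("tool_call") is None:
            return {"partial": False, "answer": reply["content"], "cost_usd": spent}
        messages.append(run_tool(reply["tool_call"]))
    return {"partial": True, "cost_usd": spent}          # turn cap hit

# Demo with a stubbed LLM: one tool call, then a final answer.
def fake_llm(messages):
    if not messages:
        return {"cost_usd": 0.01, "tool_call": {"name": "read_file"}}
    return {"cost_usd": 0.02, "tool_call": None, "content": "analysis complete"}

result = deep_analysis(fake_llm, lambda call: {"tool_result": "(file contents)"})
print(result["partial"], result["answer"])  # -> False analysis complete
```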
### Self-Reflection
After initial analysis, a self-reflection step can refine the LLM's assessment:
- Reconsider confidence level
- Verify gaming flag appropriateness
- Check for overlooked quality/risk signals
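The reflection step amounts to re-prompting the model with its own initial assessment. A sketch of building such a prompt (the wording is purely illustrative, not ShipLens's actual prompt text):

```python
# Sketch: re-prompt the model with its initial assessment.
# Prompt wording is an illustrative assumption.
def reflection_prompt(initial_analysis: dict) -> str:
    return (
        "Review your initial analysis below.\n"
        "1. Reconsider your confidence level.\n"
        "2. Verify any gaming flags are appropriate.\n"
        "3. Check for overlooked quality or risk signals.\n\n"
        f"Initial analysis: {initial_analysis}"
    )

prompt = reflection_prompt({"confidence": 0.85, "risk_signals": ["touches_auth"]})
```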
## Error Handling
| Scenario | Behavior |
|---|---|
| API timeout | Retry with exponential backoff (Oban handles this) |
| Rate limit | Job re-queued for later processing |
| Invalid JSON response | Parsed best-effort; fallback to heuristic scoring |
| Cost cap exceeded (deep) | Analysis stops, returns partial results |
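The "invalid JSON" row can be sketched as a best-effort parser: try a strict parse, then extract the first `{...}` block, then fall back to heuristics. The fallback values here are illustrative assumptions, not ShipLens's actual heuristic scores.

```python
import json
import re

def parse_analysis(raw: str) -> dict:
    """Best-effort JSON parsing with heuristic fallback (sketch)."""
    try:
        return json.loads(raw)          # strict parse first
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:                           # extract an embedded {...} block
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    # Heuristic fallback: neutral scores, low confidence (illustrative values).
    return {"complexity": 3, "impact": 3, "confidence": 0.2, "fallback": True}

print(parse_analysis('Here is my analysis: {"complexity": 4}'))  # -> {'complexity': 4}
```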
## Typical Costs
| Operation | Per Unit | Monthly (10-person team) |
|---|---|---|
| Standard commit analysis | $0.001–0.003 | $2–15 |
| Deep commit analysis | $0.01–0.50 | $1–10 |
| Weekly digest | $0.003–0.01 | $0.10–0.40 |
| 1:1 report | $0.01–0.05 | $1–5 |
| Total | | $4–30 |
