LLM Integration

ShipLens uses Anthropic's Claude models for all AI-powered features. The integration is designed for cost efficiency, transparency, and reliability.

Model Selection

| Model | Used For | Cost (per 1M tokens) |
| --- | --- | --- |
| Claude Haiku | Standard commit analysis, triage context | Input: $0.80 / Output: $4.00 |
| Claude Sonnet | Deep analysis, weekly digests, 1:1 reports | Input: $3.00 / Output: $15.00 |

Why Two Models?

The dual-model approach balances cost and quality:

  • Haiku handles ~70-80% of commits (standard depth) at very low cost. For a typical commit analysis, this costs $0.001–0.003.
  • Sonnet is reserved for tasks where quality matters most: security-sensitive code, generated reports, and agentic deep analysis.

This typically results in a 10–20× cost reduction compared to using Sonnet for everything.
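As a sketch, the routing decision and per-call cost math might look like the following (the module, function names, and depth atoms are illustrative, not the actual ShipLens API; pricing comes from the table above):

```elixir
defmodule ModelRouter do
  # Illustrative sketch of depth-based model routing.
  # USD per 1M tokens, from the pricing table above.
  @pricing %{
    haiku: %{input: 0.80, output: 4.00},
    sonnet: %{input: 3.00, output: 15.00}
  }

  # Standard-depth commits go to Haiku; deep analysis and reports use Sonnet.
  def model_for(:standard), do: :haiku
  def model_for(depth) when depth in [:deep, :digest, :one_on_one], do: :sonnet

  # Estimated cost in USD for a given token usage.
  def cost_usd(model, input_tokens, output_tokens) do
    %{input: in_rate, output: out_rate} = Map.fetch!(@pricing, model)
    input_tokens / 1_000_000 * in_rate + output_tokens / 1_000_000 * out_rate
  end
end
```

For example, `ModelRouter.cost_usd(:haiku, 2_000, 500)` works out to about $0.0036, which is why routing the bulk of commits to Haiku keeps costs low.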

Integration Architecture

Client Abstraction

The LLM client (llm/client.ex) provides two main functions:

| Function | Description |
| --- | --- |
| `complete(prompt, model)` | Standard prompt → response |
| `complete_with_tools(prompt, tools, model)` | Prompt with tool definitions for agentic use |

Both return a Response struct containing:

  • content — The LLM's response text
  • model — Which model was used
  • input_tokens / output_tokens — Token counts
  • cost_usd — Calculated cost
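A minimal sketch of that struct and a call site (field names come from the list above; the struct definition and the `{:ok, response}` return shape are assumptions, not the actual ShipLens code):

```elixir
defmodule LLM.Response do
  # Sketch of the response struct described above; field names are from
  # this page, but this is not the real definition in llm/client.ex.
  defstruct [:content, :model, :input_tokens, :output_tokens, :cost_usd]
end

# Hypothetical usage, assuming complete/2 returns {:ok, %LLM.Response{}}:
#
#   {:ok, resp} = LLM.Client.complete("Summarize this diff: ...", :haiku)
#   resp.cost_usd  # cost computed from the token counts
```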

Cost Tracking

Every LLM call has its cost tracked and stored:

  • Commit reports store cost_usd per analysis
  • Weekly digests store cost_usd per generation
  • 1:1 reports store cost_usd per report

This enables full cost visibility and budgeting.
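Because every record carries a cost_usd field, budgeting reduces to a sum. A hypothetical Ecto query (the `CommitReport` and `Repo` module names and schema fields are assumptions):

```elixir
defmodule CostReport do
  import Ecto.Query

  # Hypothetical: total analysis spend since a given date.
  # `CommitReport` and `Repo` are assumed names; the real schemas may differ.
  def spend_since(date) do
    from(r in CommitReport,
      where: r.inserted_at >= ^date,
      select: coalesce(sum(r.cost_usd), 0.0)
    )
    |> Repo.one()
  end
end
```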

Prompt Engineering

All prompts are centralized in llm/prompts.ex and share common design principles:

Structured JSON Output

Every analysis prompt requests a specific JSON schema:

```json
{
  "commit_type": "feat",
  "summary": "...",
  "areas_affected": ["auth", "api"],
  "complexity": 3,
  "impact": 4,
  "quality_signals": ["has_tests", "clean_patterns"],
  "risk_signals": ["touches_auth"],
  "slop_dimensions": {
    "verbosity": 1,
    "unnecessary_comments": 0,
    "over_engineering": 0,
    "defensive_bloat": 0,
    "style_mismatch": 0,
    "redundancy": 0,
    "scope_creep": 0
  },
  "confidence": 0.85
}
```
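Decoding that output might look like this sketch. It assumes the Jason library (the de facto Elixir JSON decoder, not confirmed by this page), and its fallback mirrors the invalid-JSON behavior described under Error Handling:

```elixir
defmodule AnalysisParser do
  # Sketch: decode the structured JSON output, falling back gracefully.
  # Key names match the schema above; this is not the real ShipLens parser.
  def parse(raw) do
    case Jason.decode(raw) do
      {:ok, %{"commit_type" => _, "confidence" => conf} = parsed}
      when is_number(conf) ->
        {:ok, parsed}

      _ ->
        # Malformed or partial output: callers fall back to heuristic scoring.
        {:error, :invalid_json}
    end
  end
end
```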

Project Context Injection

Standard analysis prompts include project context from the vector store:

```
You are analyzing a commit in the context of this project:
[2000 chars of relevant project context from RAG]

Commit: [sha]
Message: [message]
Diff: [compressed diff]
```
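Assembling that prompt is straightforward string interpolation. A sketch (the module name and commit fields are illustrative, not the real llm/prompts.ex):

```elixir
defmodule PromptSketch do
  # Sketch: build the standard analysis prompt shown above, truncating
  # the RAG context to 2000 characters as described.
  def standard_prompt(project_context, commit) do
    """
    You are analyzing a commit in the context of this project:
    #{String.slice(project_context, 0, 2000)}

    Commit: #{commit.sha}
    Message: #{commit.message}
    Diff: #{commit.diff}
    """
  end
end
```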

Agentic Deep Analysis

Deep analysis prompts give the LLM access to codebase tools:

| Tool | Capability |
| --- | --- |
| `read_file(path)` | Read any file in the repository |
| `search_codebase(query)` | Search for patterns across files |
| `list_directory(path)` | List directory contents |
| `get_file_at_commit(path, sha)` | Read a file at a specific commit |

The agentic loop runs for up to 10 turns with a $0.50 cost cap per commit.
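The loop's two stopping rules can be sketched as follows. The module, the `run_turn` callback, and the state shape are hypothetical; only the 10-turn limit and $0.50 cap come from this page:

```elixir
defmodule DeepAnalysis do
  # Sketch of the agentic loop's stopping rules: a 10-turn limit and a
  # $0.50 per-commit cost cap, as described above.
  @max_turns 10
  @cost_cap_usd 0.50

  def loop(state, run_turn, turn \\ 1, spent \\ 0.0) do
    cond do
      turn > @max_turns ->
        {:done, state}

      spent >= @cost_cap_usd ->
        # Cost cap exceeded: stop and return partial results.
        {:partial, state}

      true ->
        # run_turn.(state) stands in for one complete_with_tools call;
        # it returns the updated state plus that call's cost in USD.
        {state, cost} = run_turn.(state)
        loop(state, run_turn, turn + 1, spent + cost)
    end
  end
end
```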

Self-Reflection

After initial analysis, a self-reflection step can refine the LLM's assessment:

  • Reconsider confidence level
  • Verify gaming flag appropriateness
  • Check for overlooked quality/risk signals
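The reflection pass amounts to re-submitting the initial analysis with those three checks. A sketch (the prompt wording, module name, and the `LLM.Client.complete/2` call shape are illustrative assumptions):

```elixir
defmodule Reflection do
  # Sketch of the self-reflection pass described above.
  def reflect(analysis_json, model \\ :sonnet) do
    prompt = """
    Review your previous analysis and refine it if needed:
    #{analysis_json}

    1. Reconsider the confidence level.
    2. Verify that any gaming flags are appropriate.
    3. Check for overlooked quality or risk signals.

    Return the same JSON schema, updated.
    """

    LLM.Client.complete(prompt, model)
  end
end
```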

Error Handling

| Scenario | Behavior |
| --- | --- |
| API timeout | Retry with exponential backoff (Oban handles this) |
| Rate limit | Job re-queued for later processing |
| Invalid JSON response | Parsed best-effort; fallback to heuristic scoring |
| Cost cap exceeded (deep) | Analysis stops, returns partial results |
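The retry and re-queue behaviors map naturally onto an Oban worker. A sketch, assuming the queue name, attempt count, and `analyze/1` helper (Oban retries failed jobs with exponential backoff by default, and `{:snooze, seconds}` re-queues a job for later):

```elixir
defmodule AnalyzeCommitWorker do
  # Sketch: queue name and max_attempts are assumptions, not ShipLens config.
  use Oban.Worker, queue: :analysis, max_attempts: 5

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"commit_sha" => sha}}) do
    case analyze(sha) do
      {:ok, _report} ->
        :ok

      # Rate limited: re-queue the job for later processing.
      {:error, :rate_limited} ->
        {:snooze, 60}

      # Any other error (e.g. a timeout) triggers Oban's
      # exponential-backoff retry.
      {:error, reason} ->
        {:error, reason}
    end
  end
end
```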

Typical Costs

| Operation | Per Unit | Monthly (10-person team) |
| --- | --- | --- |
| Standard commit analysis | $0.001–0.003 | $2–15 |
| Deep commit analysis | $0.01–0.50 | $1–10 |
| Weekly digest | $0.003–0.01 | $0.10–0.40 |
| 1:1 report | $0.01–0.05 | $1–5 |
| Total | | $4–30/month |

Built with intelligence, not surveillance.