Commit Scoring (V2)
The scoring engine transforms commit analysis reports into a single numerical score (0–10). It's designed to be transparent, configurable, and independent of the LLM analysis step.
Design Principles
Multiplicative core: A commit must be both complex and impactful to score high. A simple but impactful change (like a one-line security fix) and a complex but low-impact change (like a large refactor in a trivial area) both get a low core score (1.17 out of 7.00 at the extremes).
Additive bonuses: Effort, quality signals, and risk signals add bonuses on top of the core score, each capped to prevent runaway inflation.
Full transparency: Every score stores its individual components, so you can always see exactly why a commit scored the way it did.
Re-scorable: Change the weights, switch presets, or create custom configs, then re-score your entire history without re-running the LLM analysis.
The Formula
$$\text{Score} = \frac{C \times I - 1}{24} \times 7 + E + Q + R$$
Where:
- $C$ = complexity (1–5 scale, from LLM analysis)
- $I$ = impact (1–5 scale, from LLM analysis)
- $E$ = effort bonus (logarithmic, based on lines changed)
- $Q$ = quality bonus (count of quality signals × weight, capped)
- $R$ = risk bonus (count of risk signals × weight, capped)
Normalized Core
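The core score multiplies complexity ($C$) by impact ($I$), then rescales the product linearly from its 1–25 range onto 0–7:
$$\text{Core} = C \times I, \qquad \text{Normalized} = \frac{\text{Core} - 1}{24} \times 7$$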
Why multiplicative? Addition would let a lopsided commit (complexity 1, impact 5) score the same as a balanced one (complexity 3, impact 3), because both sum to 6. Multiplication gives 5 versus 9, so a commit needs both dimensions to score high.
| Complexity | Impact | Core | Normalized |
|---|---|---|---|
| 1 | 1 | 1 | 0.00 |
| 2 | 2 | 4 | 0.88 |
| 3 | 3 | 9 | 2.33 |
| 4 | 4 | 16 | 4.38 |
| 5 | 5 | 25 | 7.00 |
| 1 | 5 | 5 | 1.17 |
| 5 | 1 | 5 | 1.17 |
Notice that a max-complexity/min-impact commit scores the same as a min-complexity/max-impact commit (1.17) — you need both to score high.
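The table can be reproduced with a few lines of Python. This is a minimal sketch rather than the engine's actual code: the function name `normalized_core` is hypothetical, and the linear rescaling of the 1–25 product onto 0–7 is inferred from the table values.

```python
def normalized_core(complexity: int, impact: int) -> float:
    """Rescale the product complexity * impact from 1..25 onto 0..7.

    Assumption: the linear mapping is inferred from the documented table.
    """
    for name, value in (("complexity", complexity), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    raw_core = complexity * impact
    return (raw_core - 1) / 24 * 7


if __name__ == "__main__":
    pairs = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (1, 5), (5, 1)]
    for complexity, impact in pairs:
        raw = complexity * impact
        print(complexity, impact, raw, f"{normalized_core(complexity, impact):.2f}")
```

Because the function depends only on the product, swapping the two arguments never changes the result, which is why the (1, 5) and (5, 1) rows match.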
Effort Bonus
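A logarithmic function of total lines changed, which prevents large diffs from dominating the score. The values in the table below are reproduced, to two decimal places, by:
$$E = \min\left(0.5,\ \frac{\log_2(1 + L)}{10}\right)$$
Where $L$ is the total number of lines changed.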
| Total Lines | Raw Log₂ | Effort (capped at 0.5) |
|---|---|---|
| 10 | 0.35 | 0.35 |
| 50 | 0.57 | 0.50 |
| 100 | 0.67 | 0.50 |
| 500 | 0.90 | 0.50 |
| 1000 | 1.00 | 0.50 |
Because the curve is logarithmic, the first 50 lines account for more of the raw value (0.57) than the next 950 lines combined (0.43), and the 0.5 cap is already reached at roughly 30 lines. This intentionally de-emphasizes raw volume.
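A sketch of the effort bonus in Python follows. The divisor of 10 is an assumption, chosen because `log2(1 + lines) / 10` matches the tabulated values to two decimal places; the names `raw_effort` and `effort_bonus` are hypothetical.

```python
import math


def raw_effort(total_lines: int, divisor: float = 10.0) -> float:
    """Uncapped logarithmic term.

    Assumption: the divisor is inferred from the table, not a documented constant.
    """
    return math.log2(1 + max(total_lines, 0)) / divisor


def effort_bonus(total_lines: int, cap: float = 0.5) -> float:
    """Effort bonus: the raw logarithmic term, limited to the cap."""
    return min(cap, raw_effort(total_lines))


if __name__ == "__main__":
    for lines in (10, 50, 100, 500, 1000):
        print(lines, f"{raw_effort(lines):.2f}", f"{effort_bonus(lines):.2f}")
```

Under this assumption every diff larger than about 30 lines receives the same 0.5 bonus, so diff size stops mattering early.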
Quality Bonus
Default: $Q = \min(1.5,\ 0.4 \times n)$, where $n$ is the number of quality signals present.
| Signals Present | Bonus |
|---|---|
| 0 | 0.0 |
| 1 | 0.4 |
| 2 | 0.8 |
| 3 | 1.2 |
| 4+ | 1.5 (capped) |
Quality signals: has_tests, good_error_handling, clean_patterns, reduces_tech_debt, good_documentation.
Risk Bonus
Default: $R = \min(0.5,\ 0.25 \times n)$, where $n$ is the number of risk signals present.
| Signals Present | Bonus |
|---|---|
| 0 | 0.0 |
| 1 | 0.25 |
| 2+ | 0.50 (capped) |
Risk signals: touches_auth, touches_payments, modifies_data_model, cross_module_change, production_hotfix.
Why does risk give a bonus, not a penalty?
Risk signals indicate that a commit touches sensitive, important areas. Doing so successfully is harder and more valuable. The bonus recognizes the additional care required, not the risk itself.
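The quality and risk bonuses share one pattern: count the signals present, multiply by a per-signal weight, and apply a cap. The sketch below uses the default weights and the signal names listed above. The helper names are hypothetical, and ignoring unrecognized signal names is an assumption rather than documented behavior.

```python
QUALITY_SIGNALS = frozenset({
    "has_tests", "good_error_handling", "clean_patterns",
    "reduces_tech_debt", "good_documentation",
})
RISK_SIGNALS = frozenset({
    "touches_auth", "touches_payments", "modifies_data_model",
    "cross_module_change", "production_hotfix",
})


def capped_bonus(present, known, per_signal, cap):
    """Count recognized signals, multiply by the weight, and apply the cap.

    Assumption: names outside `known` are ignored rather than rejected.
    """
    count = len(set(present) & known)
    return min(cap, count * per_signal)


def quality_bonus(signals, per_signal=0.4, cap=1.5):
    """Quality bonus with the default weight and cap."""
    return capped_bonus(signals, QUALITY_SIGNALS, per_signal, cap)


def risk_bonus(signals, per_signal=0.25, cap=0.5):
    """Risk bonus with the default weight and cap."""
    return capped_bonus(signals, RISK_SIGNALS, per_signal, cap)


if __name__ == "__main__":
    print(quality_bonus(["has_tests", "clean_patterns"]))
    print(risk_bonus(["touches_auth", "cross_module_change", "production_hotfix"]))
```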
Score Examples
Example 1: Simple bug fix
A one-line fix to a typo in a non-critical utility function.
| Component | Value | Contribution |
|---|---|---|
| Complexity | 1 | — |
| Impact | 1 | — |
| Normalized core | (1 × 1 − 1) / 24 × 7 | Core: 0.00 |
| Lines changed | 2 | Effort: 0.16 |
| Quality signals | 0 | Quality: 0.00 |
| Risk signals | 0 | Risk: 0.00 |
| Final score | 0.00 + 0.16 + 0.00 + 0.00 | 0.16 |
Example 2: Feature with tests
A medium-complexity feature adding a new API endpoint with test coverage.
| Component | Value | Contribution |
|---|---|---|
| Complexity | 3 | — |
| Impact | 3 | — |
| Normalized core | (3 × 3 − 1) / 24 × 7 | Core: 2.33 |
| Lines changed | 150 | Effort: 0.50 |
| Quality signals | 2 (has_tests, clean_patterns) | Quality: 0.80 |
| Risk signals | 1 (modifies_data_model) | Risk: 0.25 |
| Final score | 2.33 + 0.50 + 0.80 + 0.25 | 3.88 |
Example 3: Critical security refactor
A deep, complex refactor of the authentication system with full test coverage and clean patterns.
| Component | Value | Contribution |
|---|---|---|
| Complexity | 5 | — |
| Impact | 5 | — |
| Normalized core | (5 × 5 − 1) / 24 × 7 | Core: 7.00 |
| Lines changed | 400 | Effort: 0.50 |
| Quality signals | 4 (has_tests, good_error_handling, clean_patterns, reduces_tech_debt) | Quality: 1.50 |
| Risk signals | 2 (touches_auth, cross_module_change) | Risk: 0.50 |
| Final score | 7.00 + 0.50 + 1.50 + 0.50 | 9.50 |
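The three examples can be checked end to end with a short script. This is a sketch under stated assumptions, not the engine's implementation: `ScoreBreakdown` and `score_commit` are hypothetical names, the default weights and caps are hard-coded, and the effort divisor of 10 is inferred from the documented values.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreBreakdown:
    """Keeps each component so the final score can be explained."""
    core: float
    effort: float
    quality: float
    risk: float

    @property
    def total(self) -> float:
        return self.core + self.effort + self.quality + self.risk


def score_commit(complexity, impact, lines_changed, quality_signals, risk_signals):
    """Score a commit from signal counts using the default weights and caps."""
    core = (complexity * impact - 1) / 24 * 7
    # Assumption: a divisor of 10 reproduces the documented effort values.
    effort = min(0.5, math.log2(1 + lines_changed) / 10)
    quality = min(1.5, quality_signals * 0.4)
    risk = min(0.5, risk_signals * 0.25)
    return ScoreBreakdown(core, effort, quality, risk)


if __name__ == "__main__":
    examples = {
        "simple bug fix": (1, 1, 2, 0, 0),
        "feature with tests": (3, 3, 150, 2, 1),
        "critical security refactor": (5, 5, 400, 4, 2),
    }
    for name, args in examples.items():
        breakdown = score_commit(*args)
        print(f"{name}: {breakdown.total:.2f}")
```

Returning the breakdown instead of a bare number mirrors the transparency principle: each component stays inspectable after scoring.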
Fallback Heuristics
When complexity or impact values are missing from the LLM analysis (e.g., for shallow commits), the scoring engine estimates them from available metadata:
Complexity estimation (capped at 4 — never assigns maximum without LLM confirmation):
| Condition | Bonus |
|---|---|
| Base | 1 |
| Lines changed > 20 | +1 |
| Files changed > 3 | +1 |
| Introduces new pattern | +1 |
| Has migration | +1 |
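The complexity table translates directly into code. A minimal sketch, assuming the four conditions arrive as plain values; the function and parameter names are hypothetical.

```python
def estimate_complexity(
    lines_changed: int,
    files_changed: int,
    introduces_new_pattern: bool = False,
    has_migration: bool = False,
) -> int:
    """Fallback complexity estimate: base 1, +1 per condition met, capped at 4."""
    score = 1  # base value
    if lines_changed > 20:
        score += 1
    if files_changed > 3:
        score += 1
    if introduces_new_pattern:
        score += 1
    if has_migration:
        score += 1
    # The cap keeps the estimate below the maximum of 5, which is reserved
    # for values confirmed by the LLM analysis.
    return min(score, 4)


if __name__ == "__main__":
    print(estimate_complexity(lines_changed=120, files_changed=5, has_migration=True))
```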
Impact estimation (capped at 4):
| Condition | Value |
|---|---|
| Domain criticality: low | 1 |
| Domain criticality: medium | 2 |
| Domain criticality: high | 3 |
| Domain criticality: critical | 4 |
| Commit type is feat/fix/perf | +1 |
| Touches core system | +1 |
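The impact estimate starts from the domain criticality and adds the two bonuses before applying the cap. In this sketch the names are hypothetical, and rejecting a criticality level that the table does not list is an assumption.

```python
CRITICALITY_BASE = {"low": 1, "medium": 2, "high": 3, "critical": 4}
BONUS_COMMIT_TYPES = frozenset({"feat", "fix", "perf"})


def estimate_impact(
    domain_criticality: str,
    commit_type: str,
    touches_core_system: bool = False,
) -> int:
    """Fallback impact estimate: criticality base plus bonuses, capped at 4."""
    if domain_criticality not in CRITICALITY_BASE:
        # Assumption: levels outside the table are treated as an error.
        raise ValueError(f"unknown domain criticality: {domain_criticality!r}")
    score = CRITICALITY_BASE[domain_criticality]
    if commit_type in BONUS_COMMIT_TYPES:
        score += 1
    if touches_core_system:
        score += 1
    return min(score, 4)


if __name__ == "__main__":
    print(estimate_impact("medium", "fix", touches_core_system=True))
```

With the cap at 4, a critical-domain commit already sits at the ceiling, so the two bonuses only matter for lower criticality levels.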
Scoring Presets
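The presets in this section differ only in five parameters, so re-scoring amounts to recomputing the bonuses from stored signal counts under a different configuration. The sketch below illustrates that idea. `ScoringConfig`, `PRESETS`, and `signal_bonuses` are hypothetical names whose fields mirror the preset keys, and `effort_weight` is carried along without being used because its exact role in the effort term is not spelled out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoringConfig:
    """Field defaults match the Default preset."""
    effort_weight: float = 0.5
    quality_per_signal: float = 0.4
    quality_cap: float = 1.5
    risk_per_signal: float = 0.25
    risk_cap: float = 0.5


PRESETS = {
    "default": ScoringConfig(),
    "quality_focused": ScoringConfig(quality_per_signal=0.6, quality_cap=2.0),
    "risk_aware": ScoringConfig(risk_per_signal=0.4, risk_cap=1.0),
}


def signal_bonuses(quality_count: int, risk_count: int, config: ScoringConfig):
    """Return (quality, risk) bonuses for stored signal counts under a config."""
    quality = min(config.quality_cap, quality_count * config.quality_per_signal)
    risk = min(config.risk_cap, risk_count * config.risk_per_signal)
    return quality, risk


if __name__ == "__main__":
    # Signal counts saved from an earlier analysis; no re-analysis is needed.
    stored = {"quality_count": 3, "risk_count": 2}
    for name, config in PRESETS.items():
        quality, risk = signal_bonuses(config=config, **stored)
        print(f"{name}: quality={quality:.2f} risk={risk:.2f}")
```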
Default
Balanced weights for general-purpose scoring.
effort_weight: 0.5
quality_per_signal: 0.4
quality_cap: 1.5
risk_per_signal: 0.25
risk_cap: 0.5
Quality-Focused
Rewards engineering best practices more heavily.
effort_weight: 0.5
quality_per_signal: 0.6 ← +50% per signal
quality_cap: 2.0 ← higher cap
risk_per_signal: 0.25
risk_cap: 0.5
Risk-Aware
Gives more credit for working in sensitive areas.
effort_weight: 0.5
quality_per_signal: 0.4
quality_cap: 1.5
risk_per_signal: 0.4 ← +60% per signal
risk_cap: 1.0 ← doubled cap
V1 Scoring (Legacy)
The original scoring engine used a weighted formula based on commit type and indicator bonuses. It remains available for backward compatibility, but V2 is recommended for all new deployments.
V1 type weights:
| Type | Weight |
|---|---|
| feat | 1.0 |
| fix | 0.9 |
| refactor | 0.85 |
| perf | 0.9 |
| test | 0.6 |
| docs | 0.3 |
| chore | 0.3 |
| style | 0.1 |
V1 indicator bonuses:
| Indicator | Bonus |
|---|---|
| touches_core_system | 3.0 |
| introduces_new_pattern | 2.0 |
| has_migration | 1.5 |
| dependencies_changed | 1.0 |
| new_modules_created | 1.0 per module |
| tests_added | 0.5 per test |
V1 domain multipliers:
| Domain Criticality | Multiplier |
|---|---|
| critical | 1.5× |
| high | 1.2× |
| medium | 1.0× |
| low | 0.7× |
| trivial | 0.3× |
