Gaming Detection
ShipLens includes five pattern detectors that identify potentially artificial commit patterns. These detectors operate on a strict principle: flags only, never penalties.
Philosophy
Any scoring system creates incentives. Some engineers might — consciously or not — adapt their commit patterns to inflate scores. Gaming detection exists to surface these patterns for human review, not to automate punishment.
Important
Gaming flags are informational. They never automatically reduce scores, block commits, or trigger negative consequences. They surface patterns that managers should investigate in a 1:1 conversation.
The Five Detectors
1. Commit Splitting
What it detects: Multiple commits to overlapping files in a short time window, suggesting a single logical change was artificially split into multiple commits.
| Parameter | Value |
|---|---|
| Time window | 30 minutes |
| Minimum commits | 3 |
How it works:
- Group commits by contributor within a 30-minute sliding window
- For each group of 3+ commits, check for file overlap
- If multiple commits touch the same files within the window, flag all of them
Why this matters: Splitting a single feature into many small commits can inflate volume-based metrics. However, some developers legitimately make rapid-fire commits as part of their workflow — context matters.
2. Add-Remove Cycles
What it detects: Paired commits where one adds lines and a follow-up removes a significant portion from the same files.
| Parameter | Value |
|---|---|
| Minimum lines added | 20 |
| Removal threshold | 50% of lines added |
How it works:
- For each pair of commits from the same contributor:
  - Commit A adds ≥ 20 lines to a set of files
  - Commit B removes ≥ 50% of those added lines from the same files
- If both conditions hold, flag both commits
Pattern example:
```text
Commit A: Add 100 lines to auth.ex       ← flagged
Commit B: Remove 60 lines from auth.ex   ← flagged
```

Why this matters: This pattern can indicate code being added just to be removed, inflating both volume and churn metrics. It can also be legitimate iteration — the flag prompts investigation.
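The pair check can be sketched as follows, assuming an illustrative per-commit diffstat of `{path: (lines_added, lines_removed)}` rather than the actual ShipLens schema:

```python
def detect_add_remove_cycles(commits, min_added=20, removal_ratio=0.5):
    """Flag pairs where a later commit removes >= `removal_ratio` of the
    lines an earlier commit by the same author added to the same files.
    `commits` is time-ordered; each dict has sha, author, and a `files`
    map of {path: (added, removed)}."""
    flagged = set()
    for i, first in enumerate(commits):
        for later in commits[i + 1:]:
            if first["author"] != later["author"]:
                continue
            shared = first["files"].keys() & later["files"].keys()
            added = sum(first["files"][f][0] for f in shared)
            removed = sum(later["files"][f][1] for f in shared)
            if added >= min_added and removed >= removal_ratio * added:
                flagged.update({first["sha"], later["sha"]})
    return flagged
```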
3. False Refactors
What it detects: Commits labeled as refactors or style changes that are actually just formatting or whitespace changes, claiming credit for significant work.
| Parameter | Value |
|---|---|
| Minimum lines changed | 50 |
| Trigger keywords | format, whitespace, indent, rename, style |
How it works:
- Check if commit type is `refactor` or `style`
- Check if lines changed > 50
- Check if commit message or summary contains formatting keywords
- If all three conditions are met, flag the commit
Why this matters: A "refactor" that's actually just running a code formatter shouldn't be scored the same as a genuine architectural refactor. The flag distinguishes between meaningful and cosmetic changes.
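The three-condition check is simple enough to show directly. A minimal Python sketch, with assumed field names (`type`, `lines_changed`, `message`) standing in for the real commit record:

```python
FORMAT_KEYWORDS = {"format", "whitespace", "indent", "rename", "style"}

def is_false_refactor(commit, min_lines=50, keywords=FORMAT_KEYWORDS):
    """All three conditions must hold: the commit is typed refactor or
    style, it changes more than `min_lines` lines, and its message
    mentions a formatting keyword."""
    message = commit["message"].lower()
    return (
        commit["type"] in ("refactor", "style")
        and commit["lines_changed"] > min_lines
        and any(word in message for word in keywords)
    )
```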
4. Abnormal Frequency
What it detects: Days where a contributor's commit count is dramatically higher than their personal average.
| Parameter | Value |
|---|---|
| Multiplier threshold | 3× daily average |
| Minimum history | 2 days |
How it works:
- Calculate the contributor's average daily commit count across the analysis period
- For each day where `daily_count ≥ average × 3`, flag all commits on that day
Example:
```text
Average: 4 commits/day
Monday:  3 commits    ← normal
Tuesday: 15 commits   ← flagged (15 ≥ 4 × 3 = 12)
```

Why this matters: Occasional high-output days are normal (release day, hackathon). Persistent abnormal frequency might indicate artificial inflation. The 3× threshold is generous to avoid false positives.
5. Trivial Splits
What it detects: Many tiny commits on the same day touching the same domain, suggesting a single small change was split into trivial pieces.
| Parameter | Value |
|---|---|
| Maximum lines per commit | 5 |
| Minimum commits to flag | 5 |
How it works:
- Group commits by contributor, day, and domain
- For each group, count commits where `total_lines ≤ 5`
- If there are ≥ 5 such commits in the same group, flag all of them
Example:
```text
09:00 - fix typo in user.ex (2 lines)      ← flagged
09:05 - fix typo in user.ex (1 line)       ← flagged
09:10 - rename var in user.ex (3 lines)    ← flagged
09:15 - fix spacing in user.ex (1 line)    ← flagged
09:20 - fix comment in user.ex (2 lines)   ← flagged
```

Why this matters: Each of these could have been a single commit. Splitting them artificially inflates commit count without adding meaningful value.
How Flags Are Stored
Gaming flags are stored as a list of strings on the CommitReport:
```elixir
gaming_flags: ["commit_splitting", "add_remove_cycle", "false_refactor",
               "abnormal_frequency", "trivial_splits"]
```

A single commit can have multiple flags. Flags are surfaced in:
- Contributor profiles — Summary of flags by type
- Weekly digests — Risk signals section
- 1:1 reports — Topics to explore (when patterns are persistent)
- Alerts page — Aggregated view across all contributors
False Positive Considerations
Every detector can produce false positives in legitimate scenarios:
| Detector | Legitimate Scenario |
|---|---|
| Commit splitting | TDD red-green-refactor cycle |
| Add-remove cycles | Iterative prototyping with rollbacks |
| False refactors | Genuine formatting migration (e.g., adopting a new linter) |
| Abnormal frequency | Release day, hackathon, catching up after vacation |
| Trivial splits | Git bisect-friendly commit strategy |
This is exactly why flags never auto-penalize. They're conversation starters, not verdicts.
