
Before/After Analysis

Every process change, team restructure, or tooling investment is a bet. Before/After analysis tells you whether the bet paid off — with data, not opinions.

Purpose

CTOs regularly make interventions: adopting a new branching strategy, restructuring squads, introducing code review requirements, switching CI providers. These changes are expensive in terms of team disruption, and their impact is usually assessed through gut feeling or anecdote.

Before/After analysis selects two time periods and compares engineering metrics across them, giving you a quantified answer to: "Did this change actually improve things?"

How It Works

  1. Select the intervention date — The point in time when the change took effect.
  2. Define the "before" period — A window before the intervention (default: 28 days).
  3. Define the "after" period — A window after the intervention (default: 28 days).
  4. Compare — ShipLens computes each metric for both periods and shows the delta.
|<-- Before (28 days) -->|<-- Intervention -->|<-- After (28 days) -->|

TIP

Choosing equal-length periods is important for fair comparison. ShipLens defaults to 28 days (4 weeks) to capture a full sprint cycle and smooth out weekly patterns. Shorter periods increase noise; longer periods may include confounding factors.
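The window selection above can be sketched in a few lines. This is a minimal illustration using Python's standard `datetime` module, not ShipLens's actual implementation; the function name and return shape are assumptions.

```python
from datetime import date, timedelta

def comparison_windows(intervention: date, period_days: int = 28):
    """Return ((before_start, before_end), (after_start, after_end)).

    The 'before' window ends the day before the intervention;
    the 'after' window starts on the intervention date itself.
    Both windows are the same length, as recommended above.
    """
    before = (intervention - timedelta(days=period_days),
              intervention - timedelta(days=1))
    after = (intervention,
             intervention + timedelta(days=period_days - 1))
    return before, after

before, after = comparison_windows(date(2024, 6, 1))
# before spans 2024-05-04 .. 2024-05-31; after spans 2024-06-01 .. 2024-06-28
```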

Compared Metrics

Five metrics are compared across the two periods:

1. Velocity

Δvelocity = (commits/week (after) − commits/week (before)) / commits/week (before) × 100%

Change          Interpretation
> +15%          Significant increase in throughput
-15% to +15%    No meaningful change
< -15%          Significant decrease — expected during adjustment periods
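A hypothetical helper mirroring the Δvelocity formula and the ±15% thresholds from the table above (the function names are illustrative, not ShipLens's API):

```python
def velocity_delta(commits_per_week_before: float,
                   commits_per_week_after: float) -> float:
    """Percentage change in weekly commit throughput."""
    return ((commits_per_week_after - commits_per_week_before)
            / commits_per_week_before * 100.0)

def classify_velocity(delta_pct: float) -> str:
    """Apply the +/-15% significance thresholds."""
    if delta_pct > 15:
        return "significant increase"
    if delta_pct < -15:
        return "significant decrease"
    return "no meaningful change"

# 40 -> 50 commits/week is a +25% change: a significant increase.
print(classify_velocity(velocity_delta(40, 50)))
```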

2. Average Score

Δscore = mean score (after) − mean score (before)

Compared as an absolute difference (not percentage) since scores are on a fixed 0-10 scale.

Change          Interpretation
> +0.5          Meaningful quality improvement
-0.5 to +0.5    No meaningful change
< -0.5          Quality regression

3. Commit Frequency

frequency = total commits / (active contributors × working days)

Measured as commits per contributor per day. Normalized by active contributor count to account for team size changes.
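The normalization above is a one-line computation; this sketch just makes the units explicit (names are assumptions):

```python
def commit_frequency(total_commits: int,
                     active_contributors: int,
                     working_days: int) -> float:
    """Commits per contributor per working day."""
    return total_commits / (active_contributors * working_days)

# 200 commits from 5 contributors over 20 working days
# is 2.0 commits/contributor/day, regardless of team size changes.
freq = commit_frequency(200, 5, 20)
```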

4. Type Distribution

The proportion of each commit type (feat, fix, refactor, test, docs, chore, style, perf) in each period:

Δtype_share = type % (after) − type % (before)

What to look for:

  • Increase in feat share after removing process bottlenecks
  • Decrease in fix share after improving test practices
  • Increase in refactor share after dedicating tech debt sprints
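A sketch of the Δtype_share comparison under the assumption that each period is available as a list of commit type labels (the helper names are hypothetical):

```python
from collections import Counter

def type_shares(commit_types: list[str]) -> dict[str, float]:
    """Proportion of each commit type within one period."""
    counts = Counter(commit_types)
    total = len(commit_types)
    return {t: counts[t] / total for t in counts}

def type_share_delta(before: list[str], after: list[str]) -> dict[str, float]:
    """Per-type share change: after minus before."""
    b, a = type_shares(before), type_shares(after)
    return {t: round(a.get(t, 0.0) - b.get(t, 0.0), 3)
            for t in set(b) | set(a)}

delta = type_share_delta(
    ["feat", "fix", "fix", "chore"],      # before: 25% feat, 50% fix
    ["feat", "feat", "fix", "refactor"],  # after:  50% feat, 25% fix
)
# feat share rose by 0.25 and fix share fell by 0.25
```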

5. Slop Index

Δslop = mean slop (after) − mean slop (before)

Tracks whether AI-generated code quality changed after the intervention.

Statistical Comparison

ShipLens goes beyond comparing averages. For each metric, the comparison includes:

Statistical Measure    Purpose
Mean                   Central tendency — the "typical" value
Median                 Robust central tendency — unaffected by outliers
Standard deviation     Spread — how consistent the metric is
Distribution chart     Visual comparison of the full distribution, not just the center

Why distributions matter: An average score increase from 3.5 to 4.0 could mean everyone improved slightly — or it could mean one person started scoring 9s while everyone else stayed the same. The distribution chart reveals which scenario is happening.
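The per-period summary statistics can be reproduced with Python's standard `statistics` module. This is an illustration, not ShipLens's code; whether to use sample or population standard deviation is a design choice, and sample standard deviation (`stdev`) is shown here.

```python
import statistics

def summarize(values: list[float]) -> dict[str, float]:
    """Mean, median, and sample standard deviation for one period."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

before = summarize([3.0, 3.5, 3.5, 4.0])
after = summarize([3.5, 4.0, 4.0, 4.5])
# Means differ by 0.5, but only the full distributions reveal
# whether the shift is uniform or driven by a single outlier.
```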

Effect Size

For each metric, ShipLens computes Cohen's d to measure effect size:

d = (mean (after) − mean (before)) / s_pooled

Where:

s_pooled = √((s² (before) + s² (after)) / 2)

Cohen's d     Interpretation
< 0.2         Negligible effect
0.2 - 0.5     Small effect
0.5 - 0.8     Medium effect
> 0.8         Large effect

This matters because a statistically "significant" change in velocity might be practically meaningless (e.g., +2 commits/week on a base of 200). Cohen's d tells you whether the change is large enough to matter.
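Cohen's d with the equal-sample-size pooled standard deviation defined above can be computed directly; this sketch assumes the raw per-period values are available:

```python
import statistics
from math import sqrt

def cohens_d(before: list[float], after: list[float]) -> float:
    """Effect size: mean difference in pooled-standard-deviation units."""
    s_pooled = sqrt((statistics.stdev(before) ** 2
                     + statistics.stdev(after) ** 2) / 2)
    return (statistics.mean(after) - statistics.mean(before)) / s_pooled

# Means 3.75 vs 4.75 with equal spread: d is about 1.55, a large effect.
d = cohens_d([3.0, 3.5, 4.0, 4.5], [4.0, 4.5, 5.0, 5.5])
```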

Use Cases

Sprint Retrospective

Compare the current sprint against the previous sprint:

  • Did velocity hold steady?
  • Did the fix ratio decrease (indicating fewer bugs)?
  • Did the new code review process improve average scores?

Process Change

Adopted trunk-based development? Compare the 4 weeks before and after:

  • Expected: higher commit frequency, lower cycle time
  • Watch for: score regression (speed vs quality tradeoff)

Team Reorganization

Restructured squads? Compare performance before and after:

  • Allow a 2-week adjustment period before starting the "after" window
  • Compare at both squad and individual contributor level

Tooling Investment

Introduced a new testing framework or CI pipeline? Measure the impact:

  • Expected: increase in test type commits, decrease in fix ratio
  • Timeline: may take 4-8 weeks to show measurable impact

Route

/c/:slug/before-after

The before/after page shows:

  • Date picker for intervention point and period lengths
  • Side-by-side metric comparison with deltas and effect sizes
  • Distribution charts for each metric (before vs after overlay)
  • Type distribution stacked bar comparison
  • Summary card with overall assessment (improved / no change / declined)
