Before/After Analysis
Every process change, team restructure, or tooling investment is a bet. Before/After analysis tells you whether the bet paid off — with data, not opinions.
Purpose
CTOs regularly make interventions: adopting a new branching strategy, restructuring squads, introducing code review requirements, switching CI providers. These changes are expensive in terms of team disruption, and their impact is usually assessed through gut feeling or anecdote.
Before/After analysis selects two time periods and compares engineering metrics across them, giving you a quantified answer to: "Did this change actually improve things?"
How It Works
- Select the intervention date — The point in time when the change took effect.
- Define the "before" period — A window before the intervention (default: 28 days).
- Define the "after" period — A window after the intervention (default: 28 days).
- Compare — ShipLens computes each metric for both periods and shows the delta.
```
|<-- Before (28 days) -->|<-- Intervention -->|<-- After (28 days) -->|
```
**Tip:** Choosing equal-length periods is important for a fair comparison. ShipLens defaults to 28 days (4 weeks) to capture a full sprint cycle and smooth out weekly patterns. Shorter periods increase noise; longer periods may include confounding factors.
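The window selection described above can be sketched as a small helper. This is a minimal illustration with hypothetical names, not ShipLens's actual implementation; it assumes the "after" window begins on the intervention date itself.

```python
from datetime import date, timedelta

def comparison_windows(intervention: date, days: int = 28):
    """Return (before_start, before_end, after_start, after_end) as
    equal-length windows around an intervention date (hypothetical helper)."""
    before_start = intervention - timedelta(days=days)
    # "before" runs up to the day preceding the intervention;
    # "after" begins on the intervention date itself.
    before_end = intervention - timedelta(days=1)
    after_start = intervention
    after_end = intervention + timedelta(days=days - 1)
    return before_start, before_end, after_start, after_end

windows = comparison_windows(date(2024, 6, 1))
# before: 2024-05-04 .. 2024-05-31, after: 2024-06-01 .. 2024-06-28
```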
Compared Metrics
Five metrics are compared across the two periods:
1. Velocity
| Change | Interpretation |
|---|---|
| > +15% | Significant increase in throughput |
| -15% to +15% | No meaningful change |
| < -15% | Significant decrease — expected during adjustment periods |
2. Average Score
Compared as an absolute difference (not percentage) since scores are on a fixed 0-10 scale.
| Change | Interpretation |
|---|---|
| > +0.5 | Meaningful quality improvement |
| -0.5 to +0.5 | No meaningful change |
| < -0.5 | Quality regression |
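The velocity and score thresholds from the two tables above can be expressed as simple classifiers. This is a sketch with hypothetical function names, showing why velocity uses a relative threshold while scores use an absolute one:

```python
def classify_velocity(before: float, after: float) -> str:
    """Apply the +/-15% relative thresholds from the velocity table."""
    pct = (after - before) / before * 100
    if pct > 15:
        return "significant increase"
    if pct < -15:
        return "significant decrease"
    return "no meaningful change"

def classify_score(before: float, after: float) -> str:
    """Scores use an absolute +/-0.5 threshold on the fixed 0-10 scale."""
    delta = after - before
    if delta > 0.5:
        return "quality improvement"
    if delta < -0.5:
        return "quality regression"
    return "no meaningful change"

classify_velocity(200, 240)  # +20% -> "significant increase"
classify_score(3.5, 4.2)     # +0.7 -> "quality improvement"
```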
3. Commit Frequency
Measured as commits per contributor per day. Normalized by active contributor count to account for team size changes.
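The normalization above is a straightforward division; a sketch (hypothetical helper, not ShipLens's API):

```python
def commit_frequency(total_commits: int, active_contributors: int, period_days: int) -> float:
    """Commits per contributor per day, normalized by active contributor count."""
    return total_commits / (active_contributors * period_days)

# 280 commits from 5 active contributors over a 28-day window:
rate = commit_frequency(280, 5, 28)  # -> 2.0
```

Because the rate is per contributor, a team that doubles in size and doubles its raw commit count shows no change, which is the point of the normalization.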
4. Type Distribution
The proportion of each commit type (`feat`, `fix`, `refactor`, `test`, `docs`, `chore`, `style`, `perf`) in each period:
What to look for:
- Increase in `feat` share after removing process bottlenecks
- Decrease in `fix` share after improving test practices
- Increase in `refactor` share after dedicating tech debt sprints
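Computing the per-type shares for a period is a simple proportion; a minimal sketch (hypothetical helper names):

```python
from collections import Counter

COMMIT_TYPES = ["feat", "fix", "refactor", "test", "docs", "chore", "style", "perf"]

def type_distribution(commit_types: list[str]) -> dict[str, float]:
    """Share of each commit type within one period (0.0 if absent)."""
    counts = Counter(commit_types)
    total = len(commit_types)
    return {t: counts[t] / total for t in COMMIT_TYPES}

before = type_distribution(["feat", "fix", "fix", "feat", "chore"])
after = type_distribution(["feat", "feat", "feat", "fix", "test"])
# Per-type deltas between periods:
delta = {t: after[t] - before[t] for t in COMMIT_TYPES}
```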
5. Slop Index
Tracks whether AI-generated code quality changed after the intervention.
Statistical Comparison
ShipLens goes beyond comparing averages. For each metric, the comparison includes:
| Statistical Measure | Purpose |
|---|---|
| Mean | Central tendency — the "typical" value |
| Median | Robust central tendency — unaffected by outliers |
| Standard deviation | Spread — how consistent the metric is |
| Distribution chart | Visual comparison of the full distribution, not just the center |
Why distributions matter: An average score increase from 3.5 to 4.0 could mean everyone improved slightly — or it could mean one person started scoring 9s while everyone else stayed the same. The distribution chart reveals which scenario is happening.
Effect Size
For each metric, ShipLens computes Cohen's d to measure effect size:

d = (mean_after - mean_before) / s_pooled

Where:
- mean_before and mean_after are the metric's means in the two periods
- s_pooled is the pooled standard deviation of both periods
| Cohen's d | Interpretation |
|---|---|
| < 0.2 | Negligible effect |
| 0.2 - 0.5 | Small effect |
| 0.5 - 0.8 | Medium effect |
| > 0.8 | Large effect |
This matters because a statistically "significant" change in velocity might be practically meaningless (e.g., +2 commits/week on a base of 200). Cohen's d tells you whether the change is large enough to matter.
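The computation can be sketched with the standard library alone. This is an illustrative implementation of the standard Cohen's d formula with a pooled sample standard deviation, not ShipLens's internal code; the function names are hypothetical:

```python
import statistics

def cohens_d(before: list[float], after: list[float]) -> float:
    """Cohen's d using the pooled standard deviation of the two samples."""
    n1, n2 = len(before), len(after)
    s1, s2 = statistics.stdev(before), statistics.stdev(after)
    pooled = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(after) - statistics.mean(before)) / pooled

def effect_label(d: float) -> str:
    """Map |d| onto the interpretation table above."""
    d = abs(d)
    if d < 0.2:
        return "negligible"
    if d < 0.5:
        return "small"
    if d < 0.8:
        return "medium"
    return "large"

d = cohens_d([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
label = effect_label(d)  # means differ by 2, pooled std ~1.58 -> "large"
```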
Use Cases
Sprint Retrospective
Compare the current sprint against the previous sprint:
- Did velocity hold steady?
- Did the fix ratio decrease (indicating fewer bugs)?
- Did the new code review process improve average scores?
Process Change
Adopted trunk-based development? Compare the 4 weeks before and after:
- Expected: higher commit frequency, lower cycle time
- Watch for: score regression (speed vs quality tradeoff)
Team Reorganization
Restructured squads? Compare performance before and after:
- Allow a 2-week adjustment period before starting the "after" window
- Compare at both squad and individual contributor level
Tooling Investment
Introduced a new testing framework or CI pipeline? Measure the impact:
- Expected: increase in `test`-type commits, decrease in `fix` ratio
- Timeline: may take 4-8 weeks to show measurable impact
Route
`/c/:slug/before-after`

The before/after page shows:
- Date picker for intervention point and period lengths
- Side-by-side metric comparison with deltas and effect sizes
- Distribution charts for each metric (before vs after overlay)
- Type distribution stacked bar comparison
- Summary card with overall assessment (improved / no change / declined)
