Commit Sizing

Commit sizing categorizes commits by their total lines changed, providing insight into how developers structure their work.

Size Buckets

Total lines = lines added + lines removed.

| Size | Lines Changed | Typical Content |
| --- | --- | --- |
| S (Small) | < 100 | Bug fixes, config changes, small features |
| M (Medium) | 100 – 500 | Standard features, meaningful refactors |
| L (Large) | 501 – 1,500 | Major features, significant refactors |
| XL (Extra Large) | > 1,500 | Migrations, initial implementations, generated code |
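
The bucket boundaries above can be sketched as a small classifier. This is a minimal illustration rather than the tool's actual implementation; the function name `size_bucket` and its parameters are hypothetical.

```python
def size_bucket(lines_added: int, lines_removed: int) -> str:
    """Return the size bucket (S, M, L, or XL) for a single commit."""
    # Total lines = lines added + lines removed.
    total = lines_added + lines_removed
    if total < 100:
        return "S"   # fewer than 100 lines
    if total <= 500:
        return "M"   # 100 to 500 lines
    if total <= 1500:
        return "L"   # 501 to 1,500 lines
    return "XL"      # more than 1,500 lines


# Boundary checks: 99 lines is S, 100 is M, 501 is L, 1,501 is XL.
print(size_bucket(60, 39), size_bucket(80, 20),
      size_bucket(400, 101), size_bucket(1500, 1))
```

Because the total combines additions and removals, a commit that deletes 600 lines and adds none still lands in the L bucket.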

Complexity Smell Detection

In addition to size, commits are checked for a complexity smell:

complexity_smell = files_changed > 8

A commit touching more than 8 files, regardless of line count, gets flagged as having a complexity smell. This catches wide-but-shallow changes that might indicate:

  • Shotgun surgery (a change requiring edits across many files)
  • Incomplete abstractions
  • Cross-cutting concerns that should be centralized
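
The complexity smell rule can be expressed as a one-line check. The sketch below assumes the number of files touched by a commit is already known; the names `SMELL_FILE_THRESHOLD` and `has_complexity_smell` are hypothetical and chosen only for illustration.

```python
# Threshold taken from the rule above: a commit is flagged when files_changed > 8.
SMELL_FILE_THRESHOLD = 8


def has_complexity_smell(files_changed: int) -> bool:
    """Flag a commit that touches more than 8 files, regardless of line count."""
    return files_changed > SMELL_FILE_THRESHOLD


# The comparison is strict: exactly 8 files is not flagged, 9 files is.
print(has_complexity_smell(8), has_complexity_smell(9))
```

The check is independent of size, so a 40-line rename spread across 12 files is flagged even though it falls in the S bucket.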

Metrics

For each contributor over a date range:

| Metric | Description |
| --- | --- |
| distribution | Count of commits per size bucket (S, M, L, XL) |
| complexity_smells | Count of commits with > 8 files changed |
| percentages | Percentage breakdown across buckets |
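
As a rough sketch of how these three metrics might be computed, the example below summarizes a list of commits that is assumed to be already filtered to one contributor and one date range. The `Commit` record, the `summarize` function, and the shape of the returned dictionary are assumptions made for illustration, not the tool's actual data model.

```python
from collections import Counter
from dataclasses import dataclass

BUCKETS = ("S", "M", "L", "XL")


@dataclass
class Commit:
    # Hypothetical record; field names are assumptions for this sketch.
    lines_added: int
    lines_removed: int
    files_changed: int


def bucket(commit: Commit) -> str:
    """Classify a commit by total lines changed (added + removed)."""
    total = commit.lines_added + commit.lines_removed
    if total < 100:
        return "S"
    if total <= 500:
        return "M"
    if total <= 1500:
        return "L"
    return "XL"


def summarize(commits: list[Commit]) -> dict:
    """Compute distribution, complexity_smells, and percentages."""
    counts = Counter(bucket(c) for c in commits)
    # Include every bucket, even those with zero commits.
    distribution = {b: counts.get(b, 0) for b in BUCKETS}
    total = len(commits)
    # Guard against division by zero when there are no commits in range.
    percentages = {
        b: (100.0 * n / total if total else 0.0)
        for b, n in distribution.items()
    }
    smells = sum(1 for c in commits if c.files_changed > 8)
    return {
        "distribution": distribution,
        "complexity_smells": smells,
        "percentages": percentages,
    }


commits = [
    Commit(10, 5, 2),       # S
    Commit(200, 50, 9),     # M, flagged (9 files)
    Commit(600, 0, 3),      # L
    Commit(2000, 100, 20),  # XL, flagged (20 files)
]
print(summarize(commits))
```

Returning zero counts for empty buckets keeps the output shape stable, which makes it easier to compare contributors or date ranges side by side.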

Healthy Patterns

Mostly S and M commits

A contributor whose work is primarily small and medium commits is likely:

  • Practicing good commit discipline
  • Breaking work into reviewable chunks
  • Shipping incrementally

Frequent L and XL commits

Frequent large commits might indicate:

  • Difficulty decomposing work (worth coaching on)
  • Generated code or migrations (legitimate, context-dependent)
  • Long-running branches merged all at once
  • Infrequent committing habits

High complexity smell rate

Many commits touching more than 8 files suggest:

  • The codebase may need better abstractions
  • Work is cross-cutting and tightly coupled
  • Or the developer is making sweeping changes (refactors, renames)

Best practice

The ideal distribution is heavily weighted toward S and M, with occasional L commits for significant features and rare XL commits for migrations or initial setups. Use the distribution to guide conversations about work decomposition.
