PR Enrichment & Auto-Linking

Commits tell you what changed. Pull requests tell you why, who reviewed it, and how the team collaborated. ShipLens links these two signals together automatically, giving you a fuller picture of engineering output.

Commit-PR Auto-Linking

When ShipLens syncs a repository, it matches commits to their originating pull requests using two strategies:

Strategy	How It Works	Coverage
Merge commit parsing	Extracts PR number from merge commit messages (e.g., `Merge pull request #42`)	~80% of PRs
SHA matching	Queries the provider API for PRs containing a specific commit SHA	Catches squash merges, rebases

Both strategies run during sync. If a commit matches multiple PRs (e.g., cherry-picked across branches), all associations are recorded.
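As a rough illustration of how the two strategies could combine, the sketch below parses the PR number from a GitHub-style merge commit message and unions it with the result of a SHA lookup. The function names and the `prs_by_sha` mapping (a stand-in for the provider API query) are hypothetical and not part of ShipLens.

```python
import re
from typing import Optional

# GitHub's default merge commit subject looks like:
#   "Merge pull request #42 from owner/branch"
MERGE_PR_RE = re.compile(r"^Merge pull request #(\d+)\b")


def pr_number_from_merge_commit(message: str) -> Optional[int]:
    """Return the PR number for a GitHub-style merge commit message, else None."""
    match = MERGE_PR_RE.match(message)
    return int(match.group(1)) if match else None


def link_commit(message: str, sha: str, prs_by_sha: dict[str, list[int]]) -> set[int]:
    """Combine both strategies and keep every association found.

    prs_by_sha is a hypothetical stand-in for the provider API lookup
    ("which PRs contain this commit SHA?").
    """
    linked = set(prs_by_sha.get(sha, []))
    number = pr_number_from_merge_commit(message)
    if number is not None:
        linked.add(number)
    return linked
```

Returning a set keeps every association when a commit appears in more than one PR, instead of picking a single winner.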

PR Schema

Each synced pull request stores the following:

Field	Type	Description
`provider`	enum	`github`, `bitbucket`
`number`	integer	PR number within the repository
`title`	string	PR title as written by the author
`state`	enum	`open`, `merged`, `closed`
`author`	string	Login of the PR author
`reviewers`	list	Logins of all reviewers (requested + actual)
`additions`	integer	Lines added across all commits
`deletions`	integer	Lines removed across all commits
`review_comments`	integer	Count of review comments (not general comments)
`merged_at`	datetime	When the PR was merged (nil if not merged)
`created_at`	datetime	When the PR was opened
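For readers who prefer code to tables, the schema above could be modeled roughly as follows. This is a sketch only: the class names, defaults, and the `lines_changed` helper are assumptions for illustration, not ShipLens's actual storage model.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional


class Provider(str, Enum):
    GITHUB = "github"
    BITBUCKET = "bitbucket"


class PRState(str, Enum):
    OPEN = "open"
    MERGED = "merged"
    CLOSED = "closed"


@dataclass
class PullRequest:
    provider: Provider
    number: int
    title: str
    state: PRState
    author: str
    created_at: datetime
    reviewers: list[str] = field(default_factory=list)  # requested + actual
    additions: int = 0
    deletions: int = 0
    review_comments: int = 0  # review comments only, not general comments
    merged_at: Optional[datetime] = None  # None when the PR has not been merged

    @property
    def lines_changed(self) -> int:
        """Hypothetical helper: total lines changed (additions + deletions)."""
        return self.additions + self.deletions
```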

PR Analysis

Pull requests are analyzed by the LLM in the same way commits are — but with richer context. The PR analysis prompt includes the title, description, file list, and review discussion, producing:

Field	Type	Description
`summary`	string	What this PR accomplishes in plain language
`complexity`	1–5	Complexity of the overall change
`impact`	1–5	Impact on the system
`areas_affected`	list	Domains and modules touched

PR analysis uses Claude Haiku by default. Cost: ~$0.002–0.005 per PR.

PR Scoring

PR scoring is a separate engine from commit scoring. This is deliberate — a PR represents a unit of shipped work, while a commit represents a unit of effort. Conflating them would muddy both signals.

The PR scoring engine evaluates:

Dimension	What It Measures
Size discipline	Is the PR appropriately scoped? (penalizes both trivial and massive PRs)
Review quality	Did reviewers engage meaningfully? (comment count, back-and-forth)
Cycle time	How long from open to merge?
Description quality	Did the author explain the change? (LLM-assessed)

PR scores are not averaged into contributor scores. They're a separate lens.

Cycle Time Analysis

Cycle time is measured per PR and aggregated per squad:

cycle_time = merged_at - created_at

ShipLens breaks cycle time into percentiles per squad:

Metric	Description
`p50`	Median cycle time — the "typical" PR
`p75`	75th percentile — where slowdowns start
`p90`	90th percentile — outliers that may indicate blockers

Cycle time trends are tracked week-over-week. A sustained increase in p75 often signals process bottlenecks, unclear ownership, or review bandwidth issues.
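The calculation can be sketched as below. The percentile method (linear interpolation between ranks) and the choice of hours as the unit are assumptions; the text above does not say which ShipLens uses. Unmerged PRs have no `merged_at` and are skipped.

```python
from datetime import datetime
from typing import Optional


def percentile(sorted_values: list[float], pct: float) -> float:
    """Linear-interpolation percentile over a sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = (len(sorted_values) - 1) * pct / 100
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def cycle_time_percentiles(
    prs: list[tuple[datetime, Optional[datetime]]],
) -> dict[str, float]:
    """Compute p50/p75/p90 cycle time in hours.

    Each item is a (created_at, merged_at) pair; pairs with merged_at=None
    are ignored because cycle_time = merged_at - created_at is undefined.
    """
    hours = sorted(
        (merged - created).total_seconds() / 3600
        for created, merged in prs
        if merged is not None
    )
    return {name: percentile(hours, pct) for name, pct in (("p50", 50), ("p75", 75), ("p90", 90))}
```

Running this once per squad and per week would give the series needed to follow the week-over-week trend.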

PR Size Distribution

PR size is categorized by total lines changed (additions + deletions):

Size	Lines Changed	Ideal %
S	1–50	30–40%
M	51–200	40–50%
L	201–500	10–15%
XL	501+	< 5%

Large PR Warning

Teams where L and XL PRs consistently make up more than 20% of the total may be struggling with incremental delivery. Large PRs are harder to review, more likely to introduce bugs, and slower to merge.
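A minimal sketch of the bucketing and the 20% check follows. It treats exactly 500 lines as L, and it places PRs with zero changed lines in S; the zero-line case is an assumption, since the table starts at 1. The function names are hypothetical.

```python
from collections import Counter


def size_bucket(additions: int, deletions: int) -> str:
    """Map total lines changed (additions + deletions) to a size label."""
    total = additions + deletions
    if total <= 50:  # assumption: 0-line PRs also land here
        return "S"
    if total <= 200:
        return "M"
    if total <= 500:
        return "L"
    return "XL"


def large_pr_share(prs: list[tuple[int, int]]) -> float:
    """Fraction of PRs sized L or XL; each item is (additions, deletions)."""
    buckets = Counter(size_bucket(a, d) for a, d in prs)
    total = sum(buckets.values())
    return (buckets["L"] + buckets["XL"]) / total if total else 0.0
```

A share above `0.20` across several consecutive periods would correspond to the warning condition described above.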

Collaboration Metrics

ShipLens tracks collaboration patterns derived from PR activity:

Metric	Formula	What It Reveals
Review participation	$\frac{\text{PRs reviewed}}{\text{total team PRs}}$	Is review load shared or concentrated?
Comment density	$\frac{\text{review comments}}{\text{PRs reviewed}}$	Depth of review engagement
Self-merge ratio	$\frac{\text{PRs merged without review}}{\text{total PRs}}$	Process discipline

These metrics are surfaced at the squad level and in weekly digests.
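The formulas leave some room for interpretation, so the sketch below fixes one reading: review participation is computed per reviewer, comment density is averaged over PRs that received at least one review, and the self-merge ratio uses all PRs as the denominator. These readings, along with the `PR` record and function names, are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PR:
    author: str
    merged: bool
    reviewers: list[str] = field(default_factory=list)  # people who actually reviewed
    review_comments: int = 0


def review_participation(prs: list[PR], reviewer: str) -> float:
    """PRs reviewed by `reviewer` divided by total team PRs."""
    if not prs:
        return 0.0
    return sum(1 for pr in prs if reviewer in pr.reviewers) / len(prs)


def comment_density(prs: list[PR]) -> float:
    """Review comments per reviewed PR."""
    reviewed = [pr for pr in prs if pr.reviewers]
    if not reviewed:
        return 0.0
    return sum(pr.review_comments for pr in reviewed) / len(reviewed)


def self_merge_ratio(prs: list[PR]) -> float:
    """PRs merged without any review divided by total PRs."""
    if not prs:
        return 0.0
    return sum(1 for pr in prs if pr.merged and not pr.reviewers) / len(prs)
```

Comparing `review_participation` across squad members is one way to see whether review load is concentrated in a few people.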

Provider Sync

GitHub

GitHub PRs are synced via the GitHub REST API during repository sync. The sync fetches:

All merged PRs since the last sync point
Associated reviews and review comments
Commit SHAs for linking

Rate limiting is handled with exponential backoff. GitHub's API returns at most 100 PRs per page, so larger result sets are fetched page by page.
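The combination of pagination and exponential backoff might look like the sketch below. It makes no real HTTP calls: `fetch_page` is a hypothetical callable that returns one page of PRs and raises `RateLimited` when throttled, and the retry count and base delay are illustrative values, not ShipLens settings.

```python
import time
from typing import Callable, Iterator


class RateLimited(Exception):
    """Raised by the (hypothetical) fetch function when the provider throttles us."""


def fetch_all_pages(
    fetch_page: Callable[[int], list],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator:
    """Yield items from every page, retrying throttled requests with backoff.

    The delay doubles on each retry: base_delay, 2x, 4x, and so on.
    An empty page signals that there is nothing left to fetch.
    """
    page = 1
    while True:
        for attempt in range(max_retries + 1):
            try:
                items = fetch_page(page)
                break
            except RateLimited:
                if attempt == max_retries:
                    raise  # give up after the final retry
                sleep(base_delay * 2 ** attempt)
        if not items:
            return
        yield from items
        page += 1
```

Passing `sleep` as a parameter keeps the function testable without real waiting.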

Bitbucket

Bitbucket PRs are synced via the Bitbucket REST API (v2). The same data is collected, with field mappings adjusted for Bitbucket's schema (e.g., participants instead of reviewers).

Ticket Reference Extraction

PR titles and descriptions are scanned for ticket references using one pattern per provider. Jira and Linear keys have the same shape, so the two patterns are identical:

Provider	Pattern	Example
Jira	`[A-Z]{2,}-\d+`	`PROJ-1234`
Linear	`[A-Z]{2,}-\d+`	`ENG-456`

Extracted references are stored as structured data, not parsed further. ShipLens does not call Jira or Linear APIs during extraction — it captures the reference for display and future integration.
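A minimal extraction sketch using the pattern from the table is shown below. The word boundaries (`\b`) and the de-duplication are additions made for this example, and the function name is hypothetical.

```python
import re

# Pattern from the table, wrapped in word boundaries (an assumption) so that
# keys embedded in longer tokens are not picked up.
TICKET_RE = re.compile(r"\b[A-Z]{2,}-\d+\b")


def extract_ticket_refs(title: str, description: str = "") -> list[str]:
    """Return unique ticket references in first-seen order, title first."""
    seen: dict[str, None] = {}
    for text in (title, description or ""):
        for ref in TICKET_RE.findall(text):
            seen.setdefault(ref, None)
    return list(seen)
```

Because the pattern is purely lexical, strings such as `UTF-8` also match. That is one reason to treat the extracted values as references for display rather than as verified tickets.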

TIP

Ticket extraction is intentionally lightweight. When full ticket integration is needed (status, story points, sprint data), use the dedicated Jira or Linear integrations rather than parsing PR text.
