Reference Tables
Code Quality Benchmarks 2026
Definitive threshold tables for every major code quality metric. No vague advice. Ratings banded from Excellent to Critical, with context by project maturity, language, and team size.
Updated 16 April 2026
Master Benchmark Table
| Metric | Excellent | Acceptable | Warning | Critical |
|---|---|---|---|---|
| Technical Debt Ratio | < 5% | 5 - 10% | 10 - 20% | > 20% |
| Code Coverage | > 80% | 60 - 80% | 40 - 60% | < 40% |
| Cognitive Complexity (per function) | < 8 | 8 - 15 | 15 - 25 | > 25 |
| Cyclomatic Complexity (per function) | < 10 | 10 - 20 | 20 - 40 | > 40 |
| Code Duplication | < 3% | 3 - 5% | 5 - 10% | > 10% |
| Dependency Freshness (major versions behind) | 0 | 1 | 2 - 3 | > 3 |
| Security Vulnerabilities | 0 | Low only | Any medium | Any high/critical |
| Code Smell Density (per 1K LoC) | < 5 | 5 - 15 | 15 - 30 | > 30 |
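These bands are easy to encode directly in a CI check. Below is a minimal Python sketch that grades a measured value against the thresholds in the table above; the metric keys and function names are our own illustration, not any particular tool's API.

```python
# Band thresholds for lower-is-better metrics, mirroring the master table:
# Excellent below the first bound, Acceptable up to the second, Warning up
# to the third, Critical beyond. All names here are illustrative.
THRESHOLDS = {
    "technical_debt_ratio": (5, 10, 20),     # percent
    "cognitive_complexity": (8, 15, 25),     # per function
    "cyclomatic_complexity": (10, 20, 40),   # per function
    "duplication": (3, 5, 10),               # percent
    "majors_behind": (1, 1, 3),              # 0 / 1 / 2-3 / 4+
    "smells_per_kloc": (5, 15, 30),
}

def rate(metric: str, value: float) -> str:
    excellent, acceptable, warning = THRESHOLDS[metric]
    if value < excellent:
        return "Excellent"
    if value <= acceptable:
        return "Acceptable"
    if value <= warning:
        return "Warning"
    return "Critical"

def rate_coverage(pct: float) -> str:
    # Coverage is higher-is-better, so the comparisons flip.
    if pct > 80: return "Excellent"
    if pct >= 60: return "Acceptable"
    if pct >= 40: return "Warning"
    return "Critical"

print(rate("cyclomatic_complexity", 23))  # -> Warning
print(rate("majors_behind", 0))           # -> Excellent
print(rate_coverage(72))                  # -> Acceptable
```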
Code Coverage Benchmarks
The 80% coverage rule is a default, not a law. What matters is what you cover, not just the percentage: business logic should be near 100%, UI rendering can run lower, and taking legacy code from 20% to 40% is a bigger win than going from 80% to 85%.
The 80% myth
80% is not a magic number. It became the industry default because quality gates needed a threshold, and 80% was a reasonable compromise. For business logic, 80% is too low. For generated code, UI components, and boilerplate, 80% may be unnecessarily expensive. Target by module, not by project.
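One way to act on "target by module, not by project" is a small per-module gate. The sketch below assumes a Cobertura-format coverage.xml (what coverage.py writes via `coverage xml`); the module names and targets are hypothetical examples.

```python
# Per-module coverage gating against a Cobertura-format coverage.xml.
# Module names and targets below are hypothetical examples of setting
# a coverage floor per module rather than one project-wide number.
import sys
import xml.etree.ElementTree as ET

TARGETS = {
    "app.billing": 95,   # business logic: near 100%
    "app.api":     80,
    "app.ui":      50,   # rendering code: a lower bar is fine
}

def check(report: str = "coverage.xml") -> int:
    root = ET.parse(report).getroot()
    failures = 0
    for pkg in root.iter("package"):
        name = pkg.get("name", "")
        pct = float(pkg.get("line-rate", "0")) * 100
        target = TARGETS.get(name)
        if target is not None and pct < target:
            print(f"FAIL {name}: {pct:.1f}% < {target}%")
            failures += 1
    return failures

if __name__ == "__main__":
    sys.exit(1 if check() else 0)
```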
Complexity Benchmarks
Complexity correlates directly with bug density. Research shows that functions with cyclomatic complexity above 20 have 4-5x the bug rate of functions below 10. Cognitive complexity (used by SonarQube) is a better predictor because it penalises nesting depth.
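To make the nesting penalty concrete, here is a simplified cognitive-complexity estimator for Python functions. It loosely follows SonarSource's published rules (each branch or loop costs 1 plus its current nesting depth) but is an approximation, not SonarQube's exact algorithm.

```python
# Simplified cognitive-complexity scoring: each control structure costs
# 1 plus its nesting depth, so nested logic is penalised harder than an
# equal amount of flat logic. An approximation, not SonarQube's algorithm.
import ast

NESTING = (ast.If, ast.For, ast.While, ast.ExceptHandler)

def cognitive_complexity(func: ast.FunctionDef) -> int:
    score = 0

    def walk(node: ast.AST, depth: int) -> None:
        nonlocal score
        for child in ast.iter_child_nodes(node):
            if isinstance(child, NESTING):
                score += 1 + depth      # structure cost + nesting penalty
                walk(child, depth + 1)
            elif isinstance(child, ast.BoolOp):
                score += 1              # each and/or chain adds 1
                walk(child, depth)
            else:
                walk(child, depth)

    walk(func, 0)
    return score

src = """
def f(x):
    if x:                      # +1
        for i in range(x):     # +2  (1 + nesting depth 1)
            if i % 2:          # +3  (1 + nesting depth 2)
                print(i)
"""
print(cognitive_complexity(ast.parse(src).body[0]))  # -> 6
```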
Per-Function Thresholds
| Metric | Excellent | Acceptable | Warning | Critical |
|---|---|---|---|---|
| Cognitive Complexity | < 8 | 8 - 15 | 15 - 25 | > 25 |
| Cyclomatic Complexity | < 10 | 10 - 20 | 20 - 40 | > 40 |
Per-File Thresholds
Duplication Benchmarks
Zero duplication is not the goal. Some duplication is preferable to premature abstraction. The cost of duplication is not the extra lines, but the bugs that appear when one copy is updated and the others are not. Target reduction of high-duplication clusters, not elimination of all repeated code.
| Codebase Age | Normal Range | Context |
|---|---|---|
| New (< 1 year) | 1 - 3% | Small codebase, fresh patterns. Above 3% suggests copy-paste development habits. |
| Growing (1 - 3 years) | 3 - 6% | Some organic duplication from rapid feature development. Normal if trending stable. |
| Mature (3 - 7 years) | 5 - 10% | Multiple teams, feature branches, legacy modules. Active cleanup keeps it under 10%. |
| Legacy (7+ years) | 8 - 15% | Accumulated from team turnover and evolving requirements. Above 15% signals a systematic problem. |
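For intuition on how these percentages are measured, here is a toy line-window duplicate detector. Production tools such as CPD and SonarQube compare token streams rather than raw lines, but the principle is the same; the window size and file paths are illustrative.

```python
# Toy duplicate-block detection: normalise lines, hash fixed-size
# windows, and report any window that appears in more than one place.
from collections import defaultdict

WINDOW = 6  # minimum duplicate block size, in lines

def normalised_lines(path: str) -> list[str]:
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def find_duplicates(paths: list[str]) -> dict[str, list[tuple[str, int]]]:
    seen: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for path in paths:
        lines = normalised_lines(path)
        for i in range(len(lines) - WINDOW + 1):
            block = "\n".join(lines[i : i + WINDOW])
            seen[block].append((path, i + 1))  # (file, 1-based start line)
    return {b: locs for b, locs in seen.items() if len(locs) > 1}

for block, locations in find_duplicates(["a.py", "b.py"]).items():
    print(f"{len(locations)} copies at {locations}")
```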
Dependency Freshness Benchmarks
The cost of deferred upgrades grows exponentially with the version gap. Upgrading one major version is typically a half-day task. Upgrading three major versions can be a multi-week project with breaking changes compounding across versions.
| Major Versions Behind | Status | Upgrade Effort |
|---|---|---|
| 0 | Current: on the latest major version | Trivial |
| 1 | Previous major | Half-day |
| 2 - 3 | Multiple majors | 1 - 2 weeks |
| 4+ | Migration crisis | Weeks to months |
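For Python projects, a rough major-version gap report can be derived from pip itself. The sketch below parses `pip list --outdated --format=json` (which exposes `name`, `version`, and `latest_version` fields); taking the leading integer of each version string is a simplifying assumption that breaks on exotic version schemes.

```python
# Count how many major versions behind each outdated package is,
# using pip's own JSON output as the data source.
import json
import subprocess
import sys

def majors_behind() -> dict[str, int]:
    out = subprocess.run(
        [sys.executable, "-m", "pip", "list", "--outdated", "--format=json"],
        capture_output=True, text=True, check=True,
    ).stdout
    gaps = {}
    for pkg in json.loads(out):
        current = int(pkg["version"].split(".")[0])
        latest = int(pkg["latest_version"].split(".")[0])
        if latest > current:
            gaps[pkg["name"]] = latest - current
    return gaps

for name, gap in sorted(majors_behind().items(), key=lambda kv: -kv[1]):
    print(f"{name}: {gap} major version(s) behind")
```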
CVE Response Time Targets
DORA Metrics Mapped to Code Quality
DORA (DevOps Research and Assessment) metrics measure software delivery performance. Code quality directly influences all four metrics; teams with high code quality scores are consistently more likely to achieve elite DORA performance.
| Metric | Elite | High | Medium | Low |
|---|---|---|---|---|
| Deployment Frequency | Multiple/day | Weekly | Monthly | < Monthly |
| Lead Time for Changes | < 1 hour | < 1 day | < 1 week | > 1 month |
| Change Failure Rate | < 5% | 5 - 10% | 10 - 15% | > 15% |
| Time to Restore (MTTR) | < 1 hour | < 1 day | < 1 week | > 1 week |
- **Deployment Frequency:** high coverage and clean quality gates enable confident, frequent deploys.
- **Lead Time for Changes:** low complexity and good test coverage reduce review and validation time.
- **Change Failure Rate:** coverage gaps and high complexity directly increase the proportion of changes that fail.
- **Time to Restore (MTTR):** low cognitive complexity and good coverage make incident diagnosis faster.
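As a sketch of how two of these metrics can be derived and graded from raw deploy records, assuming a hypothetical record shape rather than any real CI/CD API:

```python
# Derive deployment frequency and change failure rate from deploy records
# and grade them against the DORA table above. Lead time and MTTR would
# additionally need commit and incident timestamps.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Deploy:
    at: datetime
    failed: bool   # did this change cause a production failure?

def dora_summary(deploys: list[Deploy]) -> dict[str, str]:
    days = max((deploys[-1].at - deploys[0].at).days, 1)
    per_day = len(deploys) / days
    cfr = 100 * sum(d.failed for d in deploys) / len(deploys)
    freq = ("Elite" if per_day >= 1 else
            "High" if per_day >= 1 / 7 else
            "Medium" if per_day >= 1 / 30 else "Low")
    fail = ("Elite" if cfr < 5 else
            "High" if cfr <= 10 else
            "Medium" if cfr <= 15 else "Low")
    return {"deployment_frequency": freq, "change_failure_rate": fail}

deploys = [Deploy(datetime(2026, 4, d), failed=(d == 3)) for d in range(1, 15)]
print(dora_summary(deploys))
# {'deployment_frequency': 'Elite', 'change_failure_rate': 'High'}
```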
Benchmarks by Language
Different languages have different normal ranges. A cyclomatic complexity of 15 in Go is unusual; in Java enterprise code it is common. Apply language-specific context to your benchmarks.
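One pragmatic way to encode that context is per-language overrides layered on the master thresholds, as sketched below. The override values are illustrative readings of the notes that follow, not published standards.

```python
# Language-aware thresholds: start from the master table and layer on
# per-language overrides (stricter complexity for Go, looser for Java,
# a higher duplication tolerance where explicit error handling inflates it).
BASE = {"cyclomatic_warning": 20, "duplication_warning_pct": 5}

OVERRIDES = {
    "go":     {"cyclomatic_warning": 12, "duplication_warning_pct": 8},
    "java":   {"cyclomatic_warning": 25},
    "python": {"cyclomatic_warning": 15},
}

def thresholds_for(lang: str) -> dict[str, float]:
    merged = dict(BASE)
    merged.update(OVERRIDES.get(lang, {}))
    return merged

print(thresholds_for("go"))
# {'cyclomatic_warning': 12, 'duplication_warning_pct': 8}
```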
Java / Kotlin
Mature tooling ecosystem. SonarQube strongest. Enterprise codebases tend toward higher complexity due to framework overhead.
Python
Dynamic typing means coverage matters more. Missing type hints compound maintenance cost. Lower complexity norm due to language expressiveness.
TypeScript / JavaScript
Frontend code tends to be undertested. React component complexity is often higher than it appears. Type coverage matters alongside line coverage.
Go
Language design enforces simplicity. Error handling inflates line counts. Coverage tooling is built in. Duplication tends to be higher due to explicit error handling.
Rust
Compiler catches many issues that other languages need tests for. Ownership model reduces certain bug classes. Coverage tooling is less mature.
C# / .NET
Similar profile to Java. Enterprise codebases can have high complexity from framework abstractions. Strong Roslyn analyser ecosystem.