Codebase Size Metrics

file_count and total_loc — Track growth and structure over time

File Count

file_count is the total number of source files analyzed during a scan. The scanner discovers all files written in a supported language (JavaScript, TypeScript, Python, Java, PHP, C#) and excludes directories such as node_modules and dist, along with other build output.
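The discovery step can be approximated locally to sanity-check a reported file_count. This is a minimal sketch, not the scanner's implementation: the extension set, the extra "build" directory, and the function name count_source_files are assumptions made for illustration.

```python
import os

# Assumed extensions for the supported languages; the scanner's actual
# mapping is not documented here.
SOURCE_EXTENSIONS = {".js", ".ts", ".py", ".java", ".php", ".cs"}

# node_modules and dist are named in the text; "build" is an illustrative extra.
EXCLUDED_DIRS = {"node_modules", "dist", "build"}


def count_source_files(root):
    """Count files under root with a supported extension, skipping excluded directories."""
    count = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        count += sum(
            1
            for name in filenames
            if os.path.splitext(name)[1].lower() in SOURCE_EXTENSIONS
        )
    return count
```

Running count_source_files(".") from the repository root gives a rough estimate to compare against the value a scan reports.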

Growth Tracking

A steadily increasing file count is normal for active projects. Sudden jumps may indicate generated code, copy-paste proliferation, or a large dependency being vendored into the repository.

Scan Limits

Hobby plans support up to 5,000 files per scan. Pro plans support up to 50,000 files. If your repository exceeds the limit, the scan will fail and you will need to upgrade or configure exclusions.
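A pre-scan check against these limits might look like the following sketch. The limits come from the paragraph above; the dictionary, the function name, and the lowercase plan keys are hypothetical.

```python
# Per-plan file limits as stated above; the plan keys are illustrative.
PLAN_FILE_LIMITS = {"hobby": 5_000, "pro": 50_000}


def files_over_limit(file_count, plan):
    """Return how many files exceed the plan's limit (0 if the scan fits)."""
    limit = PLAN_FILE_LIMITS[plan]
    return max(0, file_count - limit)
```

A non-zero result is the minimum number of files that would need to be excluded for the scan to fit within the plan.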

Total Lines of Code

total_loc is the sum of logical lines of code across all functions in the repository. Unlike raw line counts, this metric counts executable statements and excludes comments, blank lines, and import declarations. It reflects the actual volume of logic in your codebase.

Calculation

total_loc = sum(logical lines across all functions)

Each function's logical LOC is computed by the tree-sitter parser during scanning. The total is the sum across every function, method, and class in every scanned file.
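As an illustration of the summation, assume the per-function results are available as a mapping from file path to a list of logical LOC values. That data shape and the function name are assumptions for this sketch, not the scanner's output format.

```python
def compute_total_loc(function_locs_by_file):
    """Sum logical LOC over every function in every file.

    function_locs_by_file: hypothetical mapping of file path to a list of
    per-function logical LOC values.
    """
    return sum(sum(locs) for locs in function_locs_by_file.values())


# Illustrative input: two files, three functions in total.
example = {
    "src/app.py": [12, 30],
    "src/utils.ts": [8],
}
print(compute_total_loc(example))  # 50
```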

Why They Matter

Correlate Size with Complexity

If total_loc grows faster than file_count, the average file is getting longer, which usually means logic is being added to existing files and functions rather than to new ones. This often precedes a rise in complexity_avg.

Measure Decomposition

A high file_count relative to total_loc (that is, a low average LOC per file) suggests good decomposition: many small files with focused responsibilities. The inverse suggests monolithic files that are harder to maintain.

Detect Anomalies

Sudden drops in file_count or total_loc may indicate accidental deletions or a major refactoring. Sudden spikes may indicate generated code or vendored dependencies. Both are worth investigating.

Interpreting Values

Pattern | Meaning | Action
file_count and total_loc grow steadily | Healthy growth: new features being added | Monitor complexity metrics for quality
total_loc rises but file_count flat | Files getting larger: code being added to existing files | Consider splitting large files
file_count rises but total_loc flat | Code is being refactored into smaller files | Good sign; verify complexity is improving
Sudden spike in either metric | Generated code, vendored deps, or bulk import | Investigate; exclude generated files if appropriate
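The patterns above can be sketched as a comparison between two consecutive scans. The document does not define "flat" or "sudden spike" numerically, so the 2% tolerance and 20% spike threshold below are assumptions, as are the function names and return labels.

```python
def relative_change(previous, current):
    """Fractional change from previous to current (0.10 means +10%)."""
    if previous <= 0:
        raise ValueError("previous value must be positive")
    return (current - previous) / previous


def classify_trend(prev_files, curr_files, prev_loc, curr_loc,
                   flat=0.02, spike=0.20):
    """Map the change between two scans onto the patterns in the table.

    flat and spike are assumed thresholds, not values defined by the scanner.
    """
    files_delta = relative_change(prev_files, curr_files)
    loc_delta = relative_change(prev_loc, curr_loc)

    if files_delta > spike or loc_delta > spike:
        return "sudden spike"
    files_flat = abs(files_delta) <= flat
    loc_flat = abs(loc_delta) <= flat
    if loc_delta > flat and files_flat:
        return "files getting larger"
    if files_delta > flat and loc_flat:
        return "refactoring into smaller files"
    if files_delta > flat and loc_delta > flat:
        return "steady growth"
    return "no clear pattern"
```

For example, classify_trend(100, 100, 10_000, 11_000) reports "files getting larger" because total_loc rose 10% while file_count did not move.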

Average LOC Per File

avg LOC per file = total_loc / file_count

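This derived ratio is a useful indicator of file size. Because total_loc counts only logical lines, the ratio is lower than a raw lines-per-file figure would be. Track it over time to see whether your codebase is trending toward smaller, more focused files or larger, monolithic ones.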

Avg LOC/File | Assessment
≤ 100 | Well-decomposed: small, focused files
101 - 300 | Typical for most codebases
301 - 500 | Large: some files may need splitting
> 500 | Very large: likely monolithic files that should be refactored
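A small sketch of the ratio and its assessment bands follows. The function names are illustrative, and the first band is assumed to cover values up to and including 100.

```python
def avg_loc_per_file(total_loc, file_count):
    """Average logical LOC per file: total_loc / file_count."""
    if file_count <= 0:
        raise ValueError("file_count must be positive")
    return total_loc / file_count


def assess_avg_loc(avg):
    """Map an average onto the assessment bands in the table above."""
    if avg <= 100:  # assumed boundary for the first band
        return "well-decomposed"
    if avg <= 300:
        return "typical"
    if avg <= 500:
        return "large"
    return "very large"


avg = avg_loc_per_file(45_000, 300)
print(avg, assess_avg_loc(avg))  # 150.0 typical
```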

Setting Alerts

While absolute values for file_count and total_loc vary widely between projects, you can set threshold alerts to catch unexpected changes:

  • file_count spike — Alert if file count increases by more than 20% between scans to catch accidental vendoring or generated code
  • total_loc spike — Alert if LOC increases by more than 30% between scans to catch bulk additions that may need review
  • Approaching limits — Alert when file_count nears your plan's limit (5,000 for Hobby, 50,000 for Pro) to avoid scan failures
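The three alert rules above can be sketched as a single check between consecutive scans. The 20% and 30% thresholds come from the list; the 90% "near the limit" ratio, the function name, and the alert labels are assumptions for illustration.

```python
def size_alerts(prev_files, curr_files, prev_loc, curr_loc,
                plan_limit, near_limit_ratio=0.9):
    """Return the names of the size alerts triggered by the latest scan.

    near_limit_ratio is an assumed definition of "nears the plan's limit".
    """
    alerts = []
    # file_count spike: more than 20% growth between scans.
    if prev_files > 0 and (curr_files - prev_files) / prev_files > 0.20:
        alerts.append("file_count spike")
    # total_loc spike: more than 30% growth between scans.
    if prev_loc > 0 and (curr_loc - prev_loc) / prev_loc > 0.30:
        alerts.append("total_loc spike")
    # Approaching limits: file count at or above the assumed ratio of the limit.
    if curr_files >= near_limit_ratio * plan_limit:
        alerts.append("approaching limit")
    return alerts
```

For a Hobby plan, size_alerts(4_000, 4_900, 100_000, 110_000, plan_limit=5_000) flags both the file_count spike (22.5% growth) and the approaching limit, but not total_loc, which grew only 10%.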

Related Metrics

  • complexity_avg — Rising LOC per file often precedes rising average complexity
  • symbol_count — More symbols with stable file count indicates good decomposition
  • complexity_p95 — Large files tend to contain the highest-complexity functions