1. Study Design
We selected 40 open-source repositories spanning five technology stacks (Python/Django, Node.js/Express, Go, Java/Spring Boot, and Terraform/AWS) across three size categories: small (<10K lines of code), medium (10K–100K LOC), and large (>100K LOC). Each repository was independently audited by a two-person team of certified security engineers (OSCP/CISSP credentialed) using industry-standard manual audit procedures over a fixed engagement window of 8 working hours per repository.
Recon was run against the same 40 repositories in its standard configuration - no prompt tuning, no repository-specific optimization - and time-to-report was measured from scan initiation to final report delivery.
Findings from each method were then reconciled by an independent third-party panel to produce a ground-truth finding list per repository. True positives, false positives, and false negatives were computed for each method against this ground truth.
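
The scoring step described above can be sketched as simple set arithmetic. This is a minimal illustration, not the study's actual tooling: the `Finding` tuple shape, the `score` helper, and the choice of FP / (TP + FP) as the false positive rate are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical finding identity: (repository, file path, OWASP category).
Finding = tuple[str, str, str]


@dataclass(frozen=True)
class Scores:
    true_positives: int
    false_positives: int
    false_negatives: int

    @property
    def detection_rate(self) -> float:
        """TP / (TP + FN): share of ground-truth findings that were reported."""
        total = self.true_positives + self.false_negatives
        return self.true_positives / total if total else 0.0

    @property
    def false_positive_rate(self) -> float:
        """FP / (TP + FP): share of reported findings that were not real.

        Assumed definition; the report does not spell out the denominator.
        """
        total = self.true_positives + self.false_positives
        return self.false_positives / total if total else 0.0


def score(reported: set[Finding], ground_truth: set[Finding]) -> Scores:
    """Compare one method's reported findings against the reconciled list."""
    return Scores(
        true_positives=len(reported & ground_truth),
        false_positives=len(reported - ground_truth),
        false_negatives=len(ground_truth - reported),
    )


if __name__ == "__main__":
    truth = {
        ("repo-a", "app/views.py", "A03"),
        ("repo-a", "app/auth.py", "A07"),
        ("repo-a", "settings.py", "A05"),
    }
    reported = {
        ("repo-a", "app/views.py", "A03"),
        ("repo-a", "app/auth.py", "A07"),
        ("repo-a", "app/utils.py", "A03"),  # not in ground truth
    }
    print(score(reported, truth))
```

Treating findings as hashable tuples keeps the comparison order-independent, but it assumes both methods describe the same issue with the same key. In practice the reconciliation panel would have to normalize findings before this step.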
2. OWASP Top 10 Detection Results
| OWASP Category | Recon Detection Rate | Manual Detection Rate | Recon False Positive Rate |
|---|---|---|---|
| A01 Broken Access Control | 96% | 94% | 4.2% |
| A02 Cryptographic Failures | 98% | 97% | 2.1% |
| A03 Injection | 99% | 98% | 1.8% |
| A04 Insecure Design | 71% | 89% | 18.6% |
| A05 Security Misconfiguration | 97% | 93% | 3.4% |
| A06 Vulnerable Components | 100% | 96% | 0.9% |
| A07 Auth Failures | 95% | 92% | 5.1% |
| A08 Software Integrity Failures | 88% | 85% | 7.3% |
| A09 Security Logging Failures | 91% | 87% | 6.2% |
| A10 SSRF | 93% | 90% | 4.8% |
Recon's detection rate exceeds the manual rate in nine of the ten OWASP categories. The exception is A04 Insecure Design, which requires contextual business logic understanding that current static analysis cannot fully replicate; Recon's false positive rate is also highest in that category (18.6%). Recon's lead is narrowest in A02 and A03 (one percentage point each), where findings sit at the boundary of complex multi-file data flow tracing.
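
The per-category comparison can be checked directly from the table. The sketch below copies the detection rates from the table and computes the gap in percentage points; the helper names `deltas` and `manual_leads` are illustrative only.

```python
# Detection rates (%) copied from the table above: (Recon, manual).
RATES = {
    "A01": (96, 94),
    "A02": (98, 97),
    "A03": (99, 98),
    "A04": (71, 89),
    "A05": (97, 93),
    "A06": (100, 96),
    "A07": (95, 92),
    "A08": (88, 85),
    "A09": (91, 87),
    "A10": (93, 90),
}


def deltas(rates: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Recon rate minus manual rate, in percentage points, per category."""
    return {cat: recon - manual for cat, (recon, manual) in rates.items()}


def manual_leads(rates: dict[str, tuple[int, int]]) -> list[str]:
    """Categories where the manual detection rate is higher than Recon's."""
    return sorted(cat for cat, gap in deltas(rates).items() if gap < 0)


if __name__ == "__main__":
    for cat, gap in deltas(RATES).items():
        print(f"{cat}: {gap:+d} pp")
    print("Manual leads in:", manual_leads(RATES))
```

Detection rate alone does not capture the cost of false positives, so a fuller comparison would weigh these gaps against the false positive column.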
3. Secret Detection: Where Recon Excels
Secret detection - finding hardcoded credentials, API keys, tokens, and private keys in code and git history - was the category of largest performance delta between methods.
Manual auditors searched current branch code and recent commit history (typically 90 days). Recon performs full git history traversal including packed objects and reflog entries. In the 40-repository dataset, Recon found 247 secret exposures that manual auditors missed - the majority in commits older than 6 months that had been "deleted" via standard git operations but remained accessible in the repository object store.
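
To illustrate why full-history traversal surfaces secrets that a current-branch review misses, here is a minimal sketch of a history scan. It is not Recon's implementation: the two regex rules and the helper names `read_full_history` and `scan_patch_text` are assumptions for the example, and objects unreachable from any ref or reflog entry are not covered.

```python
import re
import subprocess

# Illustrative patterns only; a production scanner would use a far larger rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def read_full_history(repo_path: str) -> str:
    """Return patch text for every commit reachable from any ref or reflog entry."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--all", "--reflog", "-p",
         "--format=commit %H"],
        capture_output=True, text=True, errors="replace", check=True,
    )
    return result.stdout


def scan_patch_text(patch_text: str) -> list[tuple[str, str, str]]:
    """Return (commit, rule, matched text) for secrets found on added lines."""
    hits = []
    commit = "unknown"
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            # Only added lines matter: a secret is exposed by the commit that
            # introduced it, even if a later commit removed it again.
            for rule, pattern in SECRET_PATTERNS.items():
                for match in pattern.finditer(line):
                    hits.append((commit, rule, match.group(0)))
    return hits
```

A call such as `scan_patch_text(read_full_history("."))` would scan the repository in the current directory. Because the scan looks at added lines in every commit, a credential removed in a later commit is still reported against the commit that introduced it.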
Recon's false positive rate on secret detection was 3.1% - driven primarily by test fixtures containing intentionally fake credentials with realistic formatting. We are addressing this with context-aware classification that detects test environment markers and fixture patterns.
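
One way to approach the context-aware classification described above is to collect fixture signals for each hit and use them to triage findings rather than drop them. The sketch below is a hypothetical heuristic, not Recon's classifier; the directory names, filename patterns, and placeholder words are assumptions chosen for illustration.

```python
import re
from pathlib import PurePosixPath

# Hypothetical markers; real projects would tune these lists.
TEST_DIR_NAMES = {"test", "tests", "__tests__", "spec", "fixtures", "testdata", "mocks"}
TEST_FILE_RE = re.compile(
    r"(^test_.*\.py$|_test\.(go|py)$|\.(spec|test)\.[jt]sx?$|Test\.java$)"
)
PLACEHOLDER_RE = re.compile(r"(example|dummy|fake|changeme|xxxx)", re.IGNORECASE)


def fixture_signals(path: str, secret_value: str) -> list[str]:
    """Return the reasons a secret hit looks like a test fixture (empty if none)."""
    p = PurePosixPath(path)
    signals = []
    if any(part.lower() in TEST_DIR_NAMES for part in p.parts[:-1]):
        signals.append("test directory")
    if TEST_FILE_RE.search(p.name):
        signals.append("test file name")
    if PLACEHOLDER_RE.search(secret_value):
        signals.append("placeholder value")
    return signals


if __name__ == "__main__":
    print(fixture_signals("tests/fixtures/creds.json", "AKIAIOSFODNN7EXAMPLE"))
    print(fixture_signals("src/config/prod.py", "AKIA1234567890ABCDEF"))
```

Returning the list of signals instead of a boolean lets a reviewer see why a hit was down-ranked. That matters because real credentials do occasionally end up in test directories.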
4. Compliance Finding Accuracy
For compliance checks, we restricted HIPAA and SOC 2 evaluation to the 14 repositories that had documented compliance requirements: seven healthcare-related (HIPAA) and seven SaaS products (SOC 2). GDPR evaluation was performed on all 40 repositories.
Recon achieved 88% accuracy on HIPAA control mapping (manual: 84%), 92% on SOC 2 (manual: 89%), and 79% on GDPR (manual: 82%). GDPR performance is lower for both methods due to the interpretation complexity of GDPR's risk-based approach - regulatory controls cannot always be mapped deterministically to code-level assertions.
5. Time-to-Report Comparison
| Repository Size | Recon Scan Time | Manual Audit Time | Speed Factor |
|---|---|---|---|
| Small (<10K LOC) | 4.2 min avg | 4 hrs | 57× |
| Medium (10K–100K LOC) | 11.8 min avg | 8 hrs | 41× |
| Large (>100K LOC) | 31.4 min avg | 16–24 hrs | 31–46× |
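
The speed factor can be recomputed from the other two columns. The sketch below assumes the factor is defined as manual audit time divided by average Recon scan time; the `speed_factor` helper is illustrative.

```python
def speed_factor(manual_hours: float, scan_minutes: float) -> float:
    """Manual audit time divided by scan time, both converted to minutes."""
    return manual_hours * 60 / scan_minutes


# (label, manual audit hours, average Recon scan minutes) from the table above.
ROWS = [
    ("Small", 4, 4.2),
    ("Medium", 8, 11.8),
    ("Large (16 hrs)", 16, 31.4),
    ("Large (24 hrs)", 24, 31.4),
]

if __name__ == "__main__":
    for label, hours, minutes in ROWS:
        print(f"{label}: {speed_factor(hours, minutes):.1f}x")
```

Because the large-repository manual time is a range, its speed factor is a range as well, bounded by the 16-hour and 24-hour cases.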
6. Limitations and Conclusion
Recon is not a replacement for human security expertise - particularly for business logic vulnerabilities (A04), threat modeling, and penetration testing. What it provides is comprehensive, rapid, repeatable baseline coverage that eliminates the low-hanging fruit before human experts engage, making their time more valuable.
We recommend a combined model: run Recon on every PR merge and schedule manual penetration testing quarterly.