1. Why "Deleted" Secrets Are Not Deleted
Git is a content-addressable storage system. When you commit a file containing a secret and then delete it in a subsequent commit, the deletion creates a new commit that no longer references the file - but the original file content still exists as a blob object in the repository's object store, addressed by the SHA-1 hash of its own content and still reachable through the tree of the original commit.
Anyone with read access to the repository can retrieve this object directly:
git show 3f4a9b2:config/secrets.env # Output: the full file contents, including the secret, intact
The secret is not gone. It is simply no longer on the default navigation path. But it is fully accessible. If the repository is public or has been cloned by an unauthorized party at any point between the initial commit and the deletion commit, the secret is compromised - permanently and retroactively.
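To make the content-addressing point concrete, here is a minimal Python sketch of how Git derives a blob's object ID in a SHA-1 repository. The function name git_blob_id is ours, chosen for illustration. Because the ID depends only on the file's bytes, a later deletion commit cannot change or remove the object it names.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a short header ("blob <size>" plus a NUL byte), then the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The same bytes map to the same ID in every clone and at every commit.
print(git_blob_id(b"hello world\n"))
```

The printed value should match what git hash-object reports for a file with the same contents.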
This matters enormously in practice. In our benchmark study of 40 repositories, 73% of secret exposures found by Recon were in commits older than 90 days - well outside the scanning window of most tools that check "recent history."
2. The Full Scope of the Problem
Standard git history includes commits reachable from any branch or tag. But there are additional storage locations in a git repository that contain historical objects and are frequently overlooked by scanning tools:
- Packed objects: Git periodically runs garbage collection, which packs loose objects into compressed pack files. Secrets in old loose objects may be re-encoded into pack files but remain fully accessible via git unpack-objects or direct pack file parsing.
- The reflog: Git maintains a reflog - a log of every position each branch has pointed to over the past 90 days (by default; entries for commits no longer reachable from the branch tip expire after 30 days). Force-pushed commits that appear to have been "overwritten" are still accessible via reflog entries until they expire.
- Squash merges: When a feature branch is squash-merged, the individual commits from the feature branch may no longer appear in the main branch's history - but they remain in the repository as unreachable objects until garbage collection removes them (which may never happen in some configurations).
- Stashed changes: git stash creates commits referenced by a special ref. Stashes containing secrets can persist indefinitely even if the developer believes they have been "cleaned up."
3. Recon's Traversal Strategy
Recon's git history scanner operates in three passes:
Pass 1: Full Reachable History
Starting from all refs (branches, tags, HEAD), Recon performs a complete traversal of the reachable commit graph. For each commit, it checks all modified files (not just adds - a file modification could introduce a secret without adding a new file). This pass uses a streaming approach to avoid loading the full history into memory on large repositories.
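One way a streaming pass like this could look is sketched below in Python. It is not Recon's implementation: the function names, the single AWS key ID pattern, and the choice to drive the traversal through git log --all -p are all assumptions made for illustration. The parser consumes patch output one line at a time, so memory use stays flat regardless of history size.

```python
import re
import subprocess
from typing import Iterable, Iterator, Optional, Tuple

# Assumed example pattern; a real scanner would apply a full pattern library.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def scan_patch_stream(
    lines: Iterable[str], pattern: "re.Pattern[str]" = AWS_KEY_ID
) -> Iterator[Tuple[Optional[str], Optional[str], str]]:
    """Yield (commit, path, added_line) for each added line matching pattern."""
    commit = path = None
    for line in lines:
        if line.startswith("commit:"):
            # Marker emitted by --format=commit:%H at the start of each commit.
            commit, path = line[len("commit:"):].strip(), None
        elif line.startswith("+++ "):
            # New-side file header of a diff, e.g. "+++ b/config/secrets.env".
            path = line[4:].strip()
            if path.startswith("b/"):
                path = path[2:]
        elif line.startswith("+") and pattern.search(line):
            # Added lines cover both new files and modifications to old ones.
            yield commit, path, line[1:].rstrip("\n")


def stream_history(repo: str) -> Iterator[str]:
    """Stream patch output for every commit reachable from any ref."""
    cmd = ["git", "-C", repo, "log", "--all", "-p", "--format=commit:%H"]
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, text=True, errors="replace"
    ) as proc:
        yield from proc.stdout
```

A call such as scan_patch_stream(stream_history("/path/to/repo")) ties the two together. Note that git log -p shows no diff for merge commits unless a merge-diff option is given, so a fuller version would need to handle those separately.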
Pass 2: Packed Object Inspection
Recon reads the repository's pack index files (.git/objects/pack/*.idx) to enumerate all packed objects. It then reads every blob object that does not appear in the reachable history (unreachable blobs) and keeps those that contain secret-pattern matches. This catches secrets in objects that garbage collection has moved out of loose object storage and into pack files, where they persist even though no ref points to them.
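The enumeration step can be sketched as follows, assuming a version-2 pack index in a SHA-1 repository (20-byte object IDs). The function name is hypothetical and this is not Recon's parser. A v2 index starts with a 4-byte magic and a version number, followed by a 256-entry fanout table whose last entry is the total object count, followed by the sorted object IDs.

```python
import struct
from typing import List

IDX_MAGIC = b"\xfftOc"   # marks pack index versions 2 and later
FANOUT_ENTRIES = 256
SHA1_LEN = 20            # assumption: SHA-1 repository


def list_packed_object_ids(idx_bytes: bytes) -> List[str]:
    """Return the hex IDs of all objects listed in a v2 pack index."""
    version = struct.unpack(">I", idx_bytes[4:8])[0]
    if idx_bytes[:4] != IDX_MAGIC or version != 2:
        raise ValueError("only version-2 pack index files are handled here")
    fanout_end = 8 + 4 * FANOUT_ENTRIES
    fanout = struct.unpack(">256I", idx_bytes[8:fanout_end])
    count = fanout[-1]   # cumulative count up to first byte 0xff = total objects
    return [
        idx_bytes[fanout_end + SHA1_LEN * i: fanout_end + SHA1_LEN * (i + 1)].hex()
        for i in range(count)
    ]
```

Subtracting the IDs printed by git rev-list --objects --all from this list leaves the unreachable packed objects; git cat-file -t can then be used to keep only the blobs.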
Pass 3: Reflog and Special Refs
Recon reads all entries from all reflogs (.git/logs/refs/), extracting commit SHAs that are not reachable from current refs. These are checked for secret exposure. Stash refs (refs/stash) are also included in this pass.
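As a rough sketch of the reflog step, the Python below extracts commit IDs from reflog text and keeps those absent from a set of reachable commits. The function name and the way the reachable set is supplied are assumptions, not Recon's interface. Each reflog line begins with the old and new commit IDs, followed by the committer identity, a timestamp, and a tab-separated message.

```python
import re
from typing import Iterable, Set

# Each reflog line starts with "<old-sha> <new-sha> " (SHA-1 repositories assumed).
REFLOG_LINE = re.compile(r"^([0-9a-f]{40}) ([0-9a-f]{40}) ")
NULL_SHA = "0" * 40  # placeholder Git writes when a ref is created or deleted


def reflog_only_commits(reflog_text: str, reachable: Iterable[str]) -> Set[str]:
    """Return commit IDs that appear in the reflog but not in reachable."""
    seen: Set[str] = set()
    for line in reflog_text.splitlines():
        match = REFLOG_LINE.match(line)
        if match:
            seen.update(match.groups())
    seen.discard(NULL_SHA)
    return seen - set(reachable)
```

The reachable set could come from git rev-list --all, and the text from any file under .git/logs/refs/, including .git/logs/refs/stash, whose entries record older stashes.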
4. Secret Pattern Detection
Recon uses a multi-layer pattern detection approach. The first layer applies a library of regular expressions for known secret formats: AWS access key IDs and secret access keys, GitHub personal access tokens (including both classic ghp_ and fine-grained github_pat_ formats), Stripe API keys, Twilio auth tokens, private key PEM blocks, and 40+ additional formats covering major SaaS and cloud providers.
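A first layer of this kind might be sketched as below. The patterns are loose approximations of publicly documented token shapes, written for illustration only; they are not Recon's rule set, and the names PATTERNS and find_known_secrets are hypothetical.

```python
import re
from typing import List, Tuple

# Illustrative approximations only; production rules are stricter and more numerous.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat_classic": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "github_pat_fine_grained": re.compile(r"\bgithub_pat_[A-Za-z0-9_]{22,}\b"),
    "private_key_pem": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def find_known_secrets(text: str) -> List[Tuple[str, str]]:
    """Return (pattern_name, matched_text) for every known-format match."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits
```

Keeping each pattern under a descriptive name lets a finding report which provider's credential was matched, which matters later when deciding how to rotate it.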
The second layer applies entropy analysis to unstructured strings - detecting high-entropy character sequences that may be secrets but don't match known patterns. This catches custom-format API keys and internal secrets. The entropy threshold is calibrated to minimize false positives on legitimate high-entropy content like base64-encoded public keys and UUIDs.
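The entropy layer can be illustrated with Shannon entropy measured in bits per character. The sketch below is a simplified stand-in: the token regex, the 20-character minimum, and the 4.0 threshold are assumed values, not Recon's calibrated settings.

```python
import math
import re
from collections import Counter
from typing import List

# Candidate tokens: runs of 20+ characters drawn from common key alphabets.
TOKEN = re.compile(r"[A-Za-z0-9+/_-]{20,}")


def shannon_entropy(s: str) -> float:
    """Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    total = len(s)
    counts = Counter(s)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def high_entropy_tokens(text: str, threshold: float = 4.0) -> List[str]:
    """Return candidate tokens whose entropy exceeds threshold (assumed value)."""
    return [t for t in TOKEN.findall(text) if shannon_entropy(t) > threshold]
```

A string of one repeated character scores 0 bits, while a string of 32 distinct characters scores 5 bits, so the threshold separates random-looking keys from repetitive or word-like text.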
The third layer applies context scoring - reducing the confidence score for strings in files identified as test fixtures, mock data, or documentation (detected by file path patterns and surrounding context). This is where most false positives are suppressed.
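Path-based context scoring could look roughly like this. The directory names, the .md suffix rule, and the 0.3 penalty factor are assumptions picked for the example; the surrounding-context signals mentioned above are omitted.

```python
import re

# Assumed markers of test fixtures, mock data, and documentation.
LOW_TRUST_PATH = re.compile(
    r"(^|/)(tests?|fixtures?|mocks?|examples?|docs?)(/|$)|\.md$",
    re.IGNORECASE,
)


def context_adjusted_score(base_score: float, path: str, penalty: float = 0.3) -> float:
    """Scale down the confidence of a finding located in a low-trust path."""
    if LOW_TRUST_PATH.search(path):
        return base_score * penalty
    return base_score
```

Scaling the score rather than dropping the finding keeps real credentials that were pasted into a test file visible, just at lower priority.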
5. Remediation Guidance
When Recon finds a secret, it always accompanies the finding with a remediation priority assessment. Secrets in the current HEAD require immediate rotation. Secrets that exist only in unreachable history also require rotation, but the urgency depends on whether the repository has public visibility or has been cloned externally. Recon integrates with major secret management platforms to provide one-click rotation flows for supported provider credentials.
The only permanent fix for a secret in git history is rotation - the credential itself must be invalidated and replaced. No amount of commit history rewriting reliably eliminates the exposure once a repository has been cloned or its objects accessed externally.
Recon's deep history scanning integrates with CI/CD pipelines. The scanner runs on every push and fails the build if new secrets are detected. This catches new secrets at the point of introduction, before they are merged and buried in history.
9. Conclusion
Shallow secret scanning is insufficient. Secrets committed and "deleted" years ago remain fully accessible in git history. Recon's three-pass scanning strategy finds these hidden secrets and provides actionable remediation guidance.