Browser scanner
First-class browser-forensics surface for Chromium-based browsers and Firefox.
digger's browser scanner pulls eight artifact families from every Chromium-based profile (Chrome / Edge / Brave / Arc / Vivaldi / Opera) and cross-references every origin against live URLhaus + ThreatFox feeds. Firefox history + extensions are covered too.
What's captured
| Artifact | Source | Privacy |
|---|---|---|
| chrome.history | History SQLite | URLs + titles |
| chrome.downloads | History SQLite | target_path + source_url |
| chrome.extensions | Extensions/ + manifest.json | id / name / version / permissions / update_url |
| chrome.cookies | Cookies SQLite | per-domain counts + value bytes only, never values |
| chrome.passwords_summary | Login Data SQLite | saved count + distinct realm count only |
| chrome.indexeddb | IndexedDB/ subdirs | origin (reverse-mangled from filesystem name) + bytes |
| chrome.local_storage | Local Storage/leveldb/ via strings | origin strings + total bytes |
| chrome.pwas | Web Applications/ + manifest.json | id / name / start_url / scope |
| chrome.profile_defaults | Preferences JSON | default search engine, homepage, startup URLs, Safe Browsing on/off |
| chrome.service_workers | Service Worker/Database/ via strings | origins, script count, total storage bytes |
Detector branches
The BrowserDetector runs eight independent checks. Every
origin that appears in any artifact gets cross-referenced against the
URLhaus / ThreatFox live feeds for a known-bad-host hit.
- Extensions: risky permissions (<all_urls>, tabs, webRequestBlocking, nativeMessaging, debugger, etc.) → medium (T1176)
- Cookies: >500 cookie domains → low (T1539); cookies for known-bad host → high
- Saved passwords: >200 saved → info advisory (T1555.003)
- IndexedDB: >200 MB single-origin → low; bad-origin match → critical (T1185)
- Local Storage: bad-origin match → critical (T1185)
- PWAs: inventory (info); bad start_url → critical
- Profile defaults: non-mainstream search engine → medium (search-hijack suspicion); startup URL is bad-host → critical (T1176)
- Service workers: baseline info finding per profile (friendly vs unfamiliar origin split); ≥5 unfamiliar origins → medium; >500 MB SW storage → medium, >2 GB → high; ≥60 origins → low; bad-origin match → critical. See also the unpatched Chromium bugs corpus for the crbug-40062121 service-worker persistence story.
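As a rough illustration of how two of these branches could be expressed, the sketch below encodes the extension-permission check and the service-worker storage thresholds from the list. The function names, the constant names, and the choice of binary megabytes are assumptions made for this example; they are not taken from digger's source, and the permission set only contains the five names listed above (the "etc." is left unexpanded).

```python
from typing import List, Optional

# Assumption: thresholds are measured in binary units (1 MB = 1024 * 1024 bytes).
MB = 1024 * 1024
GB = 1024 * MB

# Only the permissions named in the list above; the real set may be larger.
RISKY_PERMISSIONS = {
    "<all_urls>",
    "tabs",
    "webRequestBlocking",
    "nativeMessaging",
    "debugger",
}


def risky_permissions(manifest_permissions: List[str]) -> List[str]:
    """Return the declared permissions that fall in the risky set, sorted."""
    return sorted(set(manifest_permissions) & RISKY_PERMISSIONS)


def sw_storage_severity(total_bytes: int) -> Optional[str]:
    """Map total service-worker storage to a severity, or None below threshold."""
    if total_bytes > 2 * GB:
        return "high"
    if total_bytes > 500 * MB:
        return "medium"
    return None
```

Checking the larger threshold first keeps the two tiers mutually exclusive, so a profile with more than 2 GB of service-worker storage yields one high finding rather than a medium and a high.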
Cookie / password counts only — never values
The browser collector deliberately reads only aggregate counts from
the password and cookie databases. The Login Data SQLite is queried
with SELECT COUNT(*), COUNT(DISTINCT signon_realm) FROM logins;
the cookie store with SELECT host_key, COUNT(*), SUM(LENGTH(value))
... GROUP BY host_key. The values themselves never enter the evidence
store, never appear in a finding, never get logged. This is an
intentional ethics-contract design: passwords + cookies are
uniquely high-impact loss surfaces, and a counts-only
inventory delivers ~90% of the investigative value (is there a
saved-credential store at all? for how many sites?) with zero exposure.
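The counts-only idea can be tried out against a throwaway database. The sketch below builds an in-memory SQLite database with a demo schema and runs the two aggregate queries quoted above. The `cookies` table name, the demo columns, and the helper names `password_summary` and `cookie_summary` are assumptions for illustration; the original cookie query elides its FROM clause, and nothing here is digger's actual collector code.

```python
import sqlite3

# Demo-only schema: just enough columns for the two aggregate queries.
# Table and column names beyond those quoted in the text are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE logins (signon_realm TEXT, password_value BLOB);
    CREATE TABLE cookies (host_key TEXT, value TEXT);
    INSERT INTO logins VALUES
        ('https://a.example/', x'00'),
        ('https://a.example/', x'01'),
        ('https://b.example/', x'02');
    INSERT INTO cookies VALUES
        ('.a.example', 'abc'),
        ('.a.example', 'de'),
        ('.b.example', 'xyz1');
    """
)


def password_summary(db: sqlite3.Connection) -> dict:
    """Saved-login count and distinct realm count; no secret column is selected."""
    saved, realms = db.execute(
        "SELECT COUNT(*), COUNT(DISTINCT signon_realm) FROM logins"
    ).fetchone()
    return {"saved": saved, "distinct_realms": realms}


def cookie_summary(db: sqlite3.Connection) -> dict:
    """Per-host cookie count and summed value length; values are never returned."""
    rows = db.execute(
        "SELECT host_key, COUNT(*), SUM(LENGTH(value)) "
        "FROM cookies GROUP BY host_key ORDER BY host_key"
    )
    return {host: {"count": n, "value_bytes": size} for host, n, size in rows}
```

Because the aggregation happens inside SQLite, the secret columns never reach Python objects at all, which is a stronger guarantee than reading rows and discarding the values afterwards.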
Origin reverse-mangling
Chrome stores IndexedDB databases in directories like
https_example.com_0.indexeddb.leveldb. The collector
reverses the underscore mangling to produce a URL-shaped origin
(https://example.com) that detectors can pattern-match
against URLhaus / ThreatFox entries without further parsing.
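A minimal sketch of the reversal, assuming the directory name has the shape scheme_host_port followed by the .indexeddb.leveldb suffix, and that a trailing 0 means "default port" and is dropped. The function name `unmangle_origin` is hypothetical, and the port interpretation is an assumption of this example rather than something the text above states.

```python
SUFFIX = ".indexeddb.leveldb"


def unmangle_origin(dirname: str) -> str:
    """Turn 'https_example.com_0.indexeddb.leveldb' into 'https://example.com'."""
    if not dirname.endswith(SUFFIX):
        raise ValueError(f"not an IndexedDB directory name: {dirname!r}")
    stem = dirname[: -len(SUFFIX)]
    # The first underscore separates the scheme; the last one separates the
    # trailing number (assumed to be a port, with 0 meaning the default port).
    scheme, _, rest = stem.partition("_")
    host, _, port = rest.rpartition("_")
    if not scheme or not host or not port.isdigit():
        raise ValueError(f"unexpected directory name shape: {dirname!r}")
    origin = f"{scheme}://{host}"
    if port != "0":
        origin += f":{port}"
    return origin
```

Splitting from both ends, rather than on every underscore, keeps hosts that themselves contain underscores intact.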
Live-feed cross-reference
On every scan, the detector pulls the latest URLhaus + ThreatFox
entries from the intel cache and builds a single bad-host set. Every
origin from every browser artifact gets a _matches_bad_origin
check: an exact host match or a subdomain rollup (e.g., foo.evil.com
in browser data matches when evil.com is in the feed). When a match
lands, the finding carries the source feed identifier so the reviewer
can pivot back to the feed entry.
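The exact-or-subdomain check could look roughly like the sketch below. It is not the real _matches_bad_origin: the name `matches_bad_origin_sketch`, the dict-of-host-to-feed shape, and the feed labels are all assumptions chosen to show how a match can carry its source feed.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def matches_bad_origin_sketch(origin: str, bad_hosts: Dict[str, str]) -> Optional[str]:
    """Return the feed identifier for an exact or parent-domain hit, else None.

    bad_hosts maps a lowercase hostname to the feed it came from (assumed shape).
    """
    host = (urlsplit(origin).hostname or "").lower()
    labels = host.split(".")
    # Try the full host first, then each parent domain: foo.evil.com, evil.com, com.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in bad_hosts:
            return bad_hosts[candidate]
    return None
```

Walking whole labels instead of using a plain string-suffix test avoids false positives such as notevil.com matching a feed entry for evil.com.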
See also
- Unpatched Chromium bugs corpus — the crbug-40062121 service-worker persistence canary and its corpus schema.
- Live feeds — URLhaus, ThreatFox, MalwareBazaar, OpenSSF, SigmaHQ, MITRE ATT&CK, Aikido Shai-Hulud.
- Ethical contract — P10 audit-visibility is why we report findings on digger itself rather than silently filtering, and P2 observation-default is why password values never enter the evidence store.