Browser scanner

First-class browser-forensics surface for Chromium-based browsers and Firefox.

digger's browser scanner pulls eight artifact families from every Chromium-based profile (Chrome / Edge / Brave / Arc / Vivaldi / Opera) and cross-references every origin against live URLhaus + ThreatFox feeds. Firefox history + extensions are covered too.

What's captured

| Artifact | Source | Privacy |
| --- | --- | --- |
| chrome.history | History SQLite | URLs + titles |
| chrome.downloads | History SQLite | target_path + source_url |
| chrome.extensions | Extensions/ + manifest.json | id / name / version / permissions / update_url |
| chrome.cookies | Cookies SQLite | per-domain counts + value bytes only, never values |
| chrome.passwords_summary | Login Data SQLite | saved count + distinct realm count only |
| chrome.indexeddb | IndexedDB/ subdirs | origin (reverse-mangled from filesystem name) + bytes |
| chrome.local_storage | Local Storage/leveldb/ via strings | origin strings + total bytes |
| chrome.pwas | Web Applications/ + manifest.json | id / name / start_url / scope |
| chrome.profile_defaults | Preferences JSON | default search engine, homepage, startup URLs, Safe Browsing on/off |
| chrome.service_workers | Service Worker/Database/ via strings | origins, script count, total storage bytes |

Detector branches

The BrowserDetector runs eight independent checks. Every origin that appears in any artifact gets cross-referenced against the URLhaus / ThreatFox live feeds for a known-bad-host hit.

Cookie / password counts only — never values

The browser collector deliberately reads only aggregate counts from the password and cookie databases. The Login Data SQLite is queried with SELECT COUNT(*), COUNT(DISTINCT signon_realm) FROM logins; the cookie store with SELECT host_key, COUNT(*), SUM(LENGTH(value)) ... GROUP BY host_key. The values themselves never enter the evidence store, never appear in a finding, never get logged. This is an intentional ethics-contract design: passwords + cookies are uniquely high-impact loss surfaces, and a counts-only inventory delivers ~90% of the investigative value (is there a saved-credential store at all? for how many sites?) with zero exposure.
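The counts-only queries above can be sketched roughly as follows. This is an illustrative sketch, not digger's actual collector: the function names `summarize_logins` and `summarize_cookies` are hypothetical, and the `FROM cookies` clause is an assumption, since the cookie query in the text elides it. Each function takes an already-open `sqlite3` connection and returns only aggregates.

```python
import sqlite3


def summarize_logins(conn: sqlite3.Connection) -> dict:
    """Return only the saved-login count and distinct realm count (hypothetical helper)."""
    saved, realms = conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT signon_realm) FROM logins"
    ).fetchone()
    return {"saved": saved, "distinct_realms": realms}


def summarize_cookies(conn: sqlite3.Connection) -> dict:
    """Return per-host cookie counts and summed value lengths (hypothetical helper).

    The table name `cookies` is an assumption; the source elides the FROM clause.
    Note that SQLite's LENGTH() counts characters for TEXT and bytes for BLOB.
    """
    rows = conn.execute(
        "SELECT host_key, COUNT(*), SUM(LENGTH(value)) "
        "FROM cookies GROUP BY host_key ORDER BY host_key"
    ).fetchall()
    # Only aggregates leave this function; no cookie value is ever selected.
    return {host: {"count": n, "value_bytes": size or 0} for host, n, size in rows}
```

Because neither statement selects a value column directly, a cookie or password value cannot reach the caller, whatever the caller does with the result.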

Origin reverse-mangling

Chrome stores IndexedDB databases in directories like https_example.com_0.indexeddb.leveldb. The collector reverses the underscore mangling to produce a URL-shaped origin (https://example.com) that detectors can pattern-match against URLhaus / ThreatFox entries without further parsing.
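A minimal sketch of that reversal, assuming the directory name has the shape `<scheme>_<host>_<port>.indexeddb.leveldb` and that a port of `0` stands for the scheme's default port. The function name `unmangle_origin` is hypothetical, and the real collector may handle more cases than this.

```python
from typing import Optional

SUFFIX = ".indexeddb.leveldb"


def unmangle_origin(dirname: str) -> Optional[str]:
    """Turn 'https_example.com_0.indexeddb.leveldb' into 'https://example.com'.

    Returns None when the name does not fit the assumed shape.
    """
    if not dirname.endswith(SUFFIX):
        return None
    stem = dirname[: -len(SUFFIX)]
    # The scheme ends at the first underscore; the port starts at the last one.
    # Splitting from both ends keeps underscores inside the host intact.
    scheme, first_sep, rest = stem.partition("_")
    host, last_sep, port = rest.rpartition("_")
    if not (first_sep and last_sep and scheme and host and port.isdigit()):
        return None
    origin = f"{scheme}://{host}"
    if port != "0":  # assumption: 0 means "default port for this scheme"
        origin += f":{port}"
    return origin
```

For example, `unmangle_origin("https_example.com_0.indexeddb.leveldb")` returns `"https://example.com"`, and a name without the expected suffix returns `None` so the caller can skip it.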

Live-feed cross-reference

On every scan, the detector pulls the latest URLhaus + ThreatFox entries from the intel cache and builds a single bad-host set. Every origin from every browser artifact gets a _matches_bad_origin check: exact host match or subdomain rollup (e.g., with evil.com in the feed, both evil.com and foo.evil.com in browser data fire). When a match lands, the finding carries the source feed identifier so the reviewer can pivot back to the feed entry.
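The exact-match-or-rollup check can be sketched as below. This is not the actual `_matches_bad_origin` implementation: the name `matches_bad_origin`, the shape of `bad_hosts` (a mapping from host to feed identifier), and the return value are all assumptions made for illustration.

```python
from typing import Mapping, Optional, Tuple
from urllib.parse import urlsplit


def matches_bad_origin(
    origin: str, bad_hosts: Mapping[str, str]
) -> Optional[Tuple[str, str]]:
    """Return (matched_host, feed_id) if the origin's host or a parent domain is listed.

    `bad_hosts` maps a lowercase host to the feed it came from (assumed shape).
    """
    host = urlsplit(origin).hostname  # None if the origin has no scheme://host part
    if not host:
        return None
    labels = host.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        # Skip bare single-label parents such as "com" when rolling up.
        if i > 0 and "." not in candidate:
            continue
        if candidate in bad_hosts:
            return candidate, bad_hosts[candidate]
    return None
```

Walking the label suffixes from most to least specific means an exact match wins over a rollup, and a look-alike such as `notevil.com` does not match `evil.com`, because the comparison is made on whole labels, not substrings.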

See also