digger
A cross-platform, forensics-grade endpoint investigation suite. It collects artifacts from Windows, macOS, and Linux into a tamper-evident evidence store, runs a stack of behavioral and intel-driven detectors, triages findings with a local LLM under Intelligence Community (IC) analytic-tradecraft standards, and produces NIST/ISO/CMMC-aware reports.
What it does, in one screen
1. Collect
~30 platform-aware collectors pull processes, network state, persistence, browsers, logs, services, and packages into a hash-chained SQLite store.
2. Detect
32 detectors: suspicious processes, persistence outliers, LOLBins, YARA/IOC/Sigma, C2 frameworks, Shai-Hulud, supply-chain (live KEV), named threat actors, browser, env hijacks, service-version CVE, firewall audit, plus 12 Decepticon countermeasures covering recon → exploitation → privesc → lateral → AD attacks → cloud → counter-RE → persistent sessions → attacker tooling.
3. Continuously poll intel
15 feeds, including CISA KEV, abuse.ch URLhaus/ThreatFox/MalwareBazaar, Tor exit, Spamhaus, OpenSSF malicious-packages, Shai-Hulud, GitHub Advisory DB, NVD CPE-keyed CVEs, the SigmaHQ rule corpus, and MITRE ATT&CK STIX, each with its own polling cadence and conditional fetches. The live-first convention is statically enforced.
4. AI triage
OpenAI-compatible local LLM (llama.cpp / ollama / vllm). Output is schema-enforced against ICD 203, the NATO Admiralty code, and TLP: estimative probability, analytic confidence, source reliability, IOCs, and MITRE ATT&CK mappings.
5. Post-quantum sign
liboqs-backed PQC. Every NIST PQC algorithm that liboqs exposes: FIPS-finalized (ML-KEM, ML-DSA, SLH-DSA), selected with its standard still pending (Falcon), Round 4 (HQC, BIKE, Classic McEliece), and signature on-ramp (CROSS, MAYO, SNOVA, SQIsign, …). Hybrid encryption with AES-256-GCM.
6. Compliance & report
Map findings to 18 frameworks, including NIST 800-53/171, CSF 2.0, CMMC, FedRAMP High, ICD 503, ISO 27001/27037, CIS, DISA STIG, PCI-DSS, HIPAA, GDPR, NIS 2, Essential 8, and SOC 2. Export to STIX 2.1 / MISP / ATT&CK Navigator / TAXII.
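Step 3 mentions conditional fetches. Assuming that refers to standard HTTP conditional requests (`ETag` with `If-None-Match`, `Last-Modified` with `If-Modified-Since`), a feed poller might look roughly like the sketch below. The function names and the cache-entry layout are hypothetical and are not taken from digger's code.

```python
import urllib.error
import urllib.request


def conditional_headers(cached):
    """Build request validators from what the previous fetch stored (may be empty)."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def merge_response(cached, status, headers, body):
    """Return the new cache entry; a 304 keeps the previously stored entry."""
    if status == 304:
        return dict(cached)
    return {
        "etag": headers.get("ETag"),
        "last_modified": headers.get("Last-Modified"),
        "body": body,
    }


def fetch(url, cached, timeout=30):
    """Fetch `url`, sending validators so an unchanged feed is not downloaded again."""
    request = urllib.request.Request(url, headers=conditional_headers(cached))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return merge_response(
                cached, response.status, response.headers, response.read()
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib surfaces 304 Not Modified as an HTTPError
            return merge_response(cached, 304, err.headers, b"")
        raise
```

Keeping `conditional_headers` and `merge_response` free of network calls means the caching logic can be exercised offline, and a per-feed scheduler only needs to persist one small cache entry per URL between polls.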
Five-minute orientation
# install
pip install -e ".[all]"
# run a full investigation: collect + scan + triage + report
digger investigate --case-dir ./case-1 --report ./case-1/report.html
# launch these docs
./docs.sh
Principles
- Evidence first
  Every component reads or writes a single, hash-chained SQLite evidence store. Nothing communicates out-of-band. Append-only. Verifiable. Signable with PQC.
- Local by default
  No cloud calls. Intel feeds are explicit GETs the user opts into. LLM triage runs on a process the user owns.
- Algorithm-agile
  PQC algorithms come from whatever liboqs exposes at runtime, not a hard-coded list. New NIST candidates are picked up by upgrading liboqs.
- Graceful degradation
  A collector that can't read a file, a detector that needs a missing Python package, an LLM that's offline: all log and move on. The tool never aborts a case because of a partial environment.
- Forensically sound
  Paired SHA-256 + SHA3-256 hash chain across artifacts and findings. ISO/IEC 27037 + NIST SP 800-86 chain-of-custody record alongside the DB. Optional PQC signature over the chain tip. TLP markings on every finding.
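The paired hash chain described above can be illustrated with a minimal sketch that uses only the Python standard library. The table layout, the all-zero genesis value, and the `open_store` / `append` / `verify` names are assumptions for illustration; digger's actual schema and chaining rule may differ.

```python
import hashlib
import json
import sqlite3

GENESIS = "0" * 64  # hypothetical starting value for both chains


def _link(prev_sha256, prev_sha3, payload):
    """Return the (SHA-256, SHA3-256) pair that extends the chain by one record."""
    h2 = hashlib.sha256(bytes.fromhex(prev_sha256) + payload).hexdigest()
    h3 = hashlib.sha3_256(bytes.fromhex(prev_sha3) + payload).hexdigest()
    return h2, h3


def open_store(path=":memory:"):
    """Open (or create) a store with one append-only evidence table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS evidence ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, payload BLOB NOT NULL, "
        "sha256 TEXT NOT NULL, sha3_256 TEXT NOT NULL)"
    )
    return db


def append(db, record):
    """Serialize `record` canonically, chain it to the current tip, return the new tip."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    row = db.execute(
        "SELECT sha256, sha3_256 FROM evidence ORDER BY seq DESC LIMIT 1"
    ).fetchone()
    prev = row if row else (GENESIS, GENESIS)
    tip = _link(prev[0], prev[1], payload)
    db.execute(
        "INSERT INTO evidence (payload, sha256, sha3_256) VALUES (?, ?, ?)",
        (payload, tip[0], tip[1]),
    )
    db.commit()
    return tip


def verify(db):
    """Recompute every link from the genesis value; False means a row was altered."""
    prev = (GENESIS, GENESIS)
    rows = db.execute("SELECT payload, sha256, sha3_256 FROM evidence ORDER BY seq")
    for payload, sha256, sha3 in rows:
        if _link(prev[0], prev[1], payload) != (sha256, sha3):
            return False
        prev = (sha256, sha3)
    return True
```

Each record's digests cover the previous record's digests plus its own payload, so editing any earlier row invalidates that link and `verify` returns `False`. Signing the final pair (the chain tip) would then commit to the whole history with a single signature.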
Where to next
Getting started →
Install, first case, llama.cpp setup, common flags.
Architecture →
How the pipeline fits together, what each module is responsible for.
Extending →
Writing a new collector, detector, compliance framework, or intel feed.
Forensics-grade →
Chain of custody, hash chain, PQC signing, evidence handling.