Analytic tradecraft
ICD 203, NATO Admiralty, TLP, and Heuer's Analysis of Competing Hypotheses, wired into the AI triage schema and the case report.
Why this matters
An LLM left unconstrained will produce confidently-worded analysis with no calibrated uncertainty, no separation of source from inference, and no explicit assumptions. The IC analytic standards exist exactly to defend against those failure modes. digger pins them into the triage prompt and the output schema, and the report renderer surfaces them.
ICD 203 — estimative probability
The seven-step IC ladder maps natural-language probability claims to numeric ranges. The ranges are those published in ICD 203 (ODNI) and follow Sherman Kent's earlier work on estimative language:
| Phrase | Range |
|---|---|
| almost no chance | 0.01 – 0.05 |
| very unlikely | 0.05 – 0.20 |
| unlikely | 0.20 – 0.45 |
| roughly even chance | 0.45 – 0.55 |
| likely | 0.55 – 0.80 |
| very likely | 0.80 – 0.95 |
| almost certain | 0.95 – 0.99 |
The triage schema requires estimative_probability to be one of these seven exact strings. Off-ladder phrases ("probably", "maybe", "highly likely") are not permitted: they have no defined range on the ladder, so they invite ambiguity or false precision.
Separately, analytic_confidence grades the analyst's confidence in the judgment itself, on a three-step ladder: low, moderate, high. Probability and confidence are independent dimensions; a judgment can be "very likely" and still be held with low confidence when the evidence base is thin.
# In code
from digger.tradecraft import (
    ESTIMATIVE_PROBABILITY, ESTIMATIVE_RANGES, validate_judgment,
)
from digger.tradecraft.icd203 import label_for_probability
label_for_probability(0.7) # → "likely"
validate_judgment("probably") # → False
ESTIMATIVE_RANGES["very likely"] # → (0.80, 0.95)
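The mapping from a numeric probability to a ladder phrase can be sketched without digger at all. The following is a minimal illustration, not digger's implementation: the names `label_for` and `is_valid_phrase` are hypothetical, and because adjacent ranges in the table share their endpoints, the sketch assumes a shared boundary belongs to the higher rung.

```python
from typing import Optional

# (phrase, low, high) for each rung, taken from the table above.
_LADDER = [
    ("almost no chance", 0.01, 0.05),
    ("very unlikely", 0.05, 0.20),
    ("unlikely", 0.20, 0.45),
    ("roughly even chance", 0.45, 0.55),
    ("likely", 0.55, 0.80),
    ("very likely", 0.80, 0.95),
    ("almost certain", 0.95, 0.99),
]


def label_for(p: float) -> Optional[str]:
    """Return the ladder phrase for p, or None when p is outside 0.01..0.99."""
    if not 0.01 <= p <= 0.99:
        return None
    for phrase, low, high in _LADDER:
        # Assumption: a value on a shared boundary goes to the higher rung.
        if low <= p < high:
            return phrase
    return "almost certain"  # p == 0.99, the closed top of the ladder


def is_valid_phrase(text: str) -> bool:
    """True only for one of the seven exact ladder strings."""
    return text in {phrase for phrase, _, _ in _LADDER}
```

Returning None outside the ladder, rather than clamping, keeps the caller from silently labelling 0.999 as "almost certain"; whether digger makes the same choice is not stated in this page.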
NATO Admiralty — source & information reliability
Defined in STANAG 2511, the Admiralty code is a two-dimensional rating used across NATO, Five Eyes, and most Western intelligence services. Source reliability grades the source; information credibility grades the individual piece of information.
Source reliability
| Code | Meaning |
|---|---|
| A | Completely reliable — history of complete reliability |
| B | Usually reliable |
| C | Fairly reliable |
| D | Not usually reliable |
| E | Unreliable |
| F | Reliability cannot be judged |
Information credibility
| Code | Meaning |
|---|---|
| 1 | Confirmed by other sources |
| 2 | Probably true — consistent with other information |
| 3 | Possibly true — reasonably consistent |
| 4 | Doubtful — not consistent |
| 5 | Improbable — contradicted |
| 6 | Truth cannot be judged |
For digger findings, the source is one or more collectors. First-party deterministic collectors (processes, network, system, registry, launchd, systemd) default to source reliability A. The information credibility depends on whether the artifact is corroborated by other artifacts.
from digger.tradecraft import rate_source, rate_info
from digger.tradecraft.admiralty import derive_for_collector
derive_for_collector("processes") # → "A"
rate_source("B") # → "Usually reliable — ..."
rate_info("1") # → "Confirmed by other sources — ..."
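The two dimensions are conventionally written together as a two-character rating such as "B2". The sketch below shows one way to validate and expand such a rating using the two tables above. It is illustrative only: `describe_rating` and the dictionary names are hypothetical and are not part of digger's API.

```python
SOURCE_RELIABILITY = {
    "A": "Completely reliable",
    "B": "Usually reliable",
    "C": "Fairly reliable",
    "D": "Not usually reliable",
    "E": "Unreliable",
    "F": "Reliability cannot be judged",
}

INFO_CREDIBILITY = {
    "1": "Confirmed by other sources",
    "2": "Probably true",
    "3": "Possibly true",
    "4": "Doubtful",
    "5": "Improbable",
    "6": "Truth cannot be judged",
}


def describe_rating(rating: str) -> str:
    """Expand a combined rating such as "B2" into readable text."""
    if len(rating) != 2:
        raise ValueError(f"expected a letter and a digit, got {rating!r}")
    source, info = rating[0].upper(), rating[1]
    if source not in SOURCE_RELIABILITY or info not in INFO_CREDIBILITY:
        raise ValueError(f"unknown Admiralty rating {rating!r}")
    return (
        f"{source}{info}: {SOURCE_RELIABILITY[source]} source; "
        f"{INFO_CREDIBILITY[info]}"
    )
```

Keeping the two codes separate until display time preserves the point of the scheme: a completely reliable source can still report something that other evidence contradicts (A5).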
Traffic Light Protocol (TLP 2.0)
The FIRST.org sharing-control standard used by CISA, ENISA, FIRST CSIRTs, and ISACs.
| Marking | Disclosure |
|---|---|
| TLP:CLEAR | Unrestricted disclosure |
| TLP:GREEN | Limited to peer and partner organizations |
| TLP:AMBER | Recipient organization and clients on need-to-know |
| TLP:AMBER+STRICT | Recipient organization only |
| TLP:RED | Named recipients only, no further sharing |
The triage schema requires every finding to carry a TLP marking. Exports to STIX, MISP, and TAXII attach the marking to every produced object and filter by sharing level when requested:
from digger.tradecraft import TLP, can_share, apply_tlp_filter
can_share(TLP.RED, TLP.AMBER) # → False (can't share red at amber level)
can_share(TLP.GREEN, TLP.AMBER) # → True
apply_tlp_filter(findings, TLP.GREEN) # only findings marked CLEAR/GREEN
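The sharing check above amounts to an ordering on the markings. A minimal sketch of that idea follows, assuming the markings are ranked from least to most restrictive in the order of the table. The names `Tlp` and `filter_by_level`, and the `"tlp"` key on each finding, are hypothetical stand-ins and may not match digger's own types.

```python
from enum import IntEnum
from typing import Dict, List


class Tlp(IntEnum):
    """TLP 2.0 markings ranked from least to most restrictive."""
    CLEAR = 0
    GREEN = 1
    AMBER = 2
    AMBER_STRICT = 3
    RED = 4


def can_share(marking: Tlp, level: Tlp) -> bool:
    """A finding may be shared when its marking is no stricter than the level."""
    return marking <= level


def filter_by_level(findings: List[Dict], level: Tlp) -> List[Dict]:
    """Keep only the findings whose marking permits sharing at the given level."""
    return [f for f in findings if can_share(f["tlp"], level)]
```

With this ordering, filtering at `Tlp.GREEN` keeps findings marked CLEAR or GREEN and drops everything stricter, which matches the behaviour described for `apply_tlp_filter`.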
Heuer's Analysis of Competing Hypotheses (ACH)
Structured Analytic Technique #5 in the ODNI catalog. Defends against confirmation bias by enumerating competing hypotheses, listing evidence, and grading each (hypothesis, evidence) pair as Consistent / Inconsistent / Not Applicable.
The "winning" hypothesis is the one that minimizes inconsistencies — not the one with the most consistencies (which would bias toward whichever hypothesis is the easiest to confirm).
from digger.tradecraft import build_matrix
m = build_matrix(
    hypotheses=["H1: legitimate dev tooling",
                "H2: targeted post-exploitation"],
    evidence=["E1: cron job in /etc/cron.d",
              "E2: connections to webhook.site",
              "E3: binary signed by JetBrains"],
    ratings=[
        ["C", "C"],  # E1 consistent with both
        ["I", "C"],  # E2 inconsistent with H1
        ["C", "I"],  # E3 inconsistent with H2
    ],
)
m.inconsistency_scores() # → [1, 1] (tied)
m.winning_hypothesis() # → 0 or 1 (tie broken by most-consistent; in this example both also have 2 consistent ratings)
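The scoring rule itself is small enough to sketch independently of digger. The version below is an illustration under stated assumptions, not the library's code: the function names are hypothetical, ratings use "C" and "I" as in the example above plus an assumed "N" for Not Applicable, and ties are reported as a list of indices instead of being broken.

```python
from typing import List


def inconsistency_scores(ratings: List[List[str]]) -> List[int]:
    """Count "I" ratings per hypothesis; ratings[e][h] rates evidence e against hypothesis h."""
    if not ratings:
        return []
    hypothesis_count = len(ratings[0])
    return [
        sum(1 for row in ratings if row[h] == "I")
        for h in range(hypothesis_count)
    ]


def least_inconsistent(ratings: List[List[str]]) -> List[int]:
    """Return the indices of every hypothesis with the fewest inconsistencies."""
    scores = inconsistency_scores(ratings)
    if not scores:
        return []
    best = min(scores)
    return [h for h, score in enumerate(scores) if score == best]
```

Returning every tied index makes a tie visible to the analyst, who can then look for evidence that discriminates between the surviving hypotheses.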
The AI triage prompt asks the model to volunteer at least two competing hypotheses for every finding. They're stored in the alternative_hypotheses field and rendered in the report.
How they compose in the report
Each triaged finding in the HTML report shows:
- Severity pill + IC verdict (false_positive .. confirmed_malicious)
- Estimative probability label + analytic confidence
- Source reliability (A-F) and information credibility (1-6)
- TLP marking
- One-line summary + rationale + assumptions + alternative hypotheses
- Next steps and IOCs
The case-wide executive summary aggregates these into one overall probability, one overall confidence, key judgments, and an attribution hint when signals converge.
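The per-finding fields listed above could be modeled and checked along the following lines. This is a sketch only, not digger's actual schema. The field names estimative_probability, analytic_confidence, and alternative_hypotheses come from this page; the class name, the remaining field names, and the `problems` method are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

PROBABILITY_PHRASES = {
    "almost no chance", "very unlikely", "unlikely", "roughly even chance",
    "likely", "very likely", "almost certain",
}
CONFIDENCE_LEVELS = {"low", "moderate", "high"}
TLP_MARKINGS = {
    "TLP:CLEAR", "TLP:GREEN", "TLP:AMBER", "TLP:AMBER+STRICT", "TLP:RED",
}


@dataclass
class TriagedFinding:
    summary: str
    estimative_probability: str
    analytic_confidence: str
    source_reliability: str        # Admiralty letter, A-F (hypothetical field name)
    information_credibility: str   # Admiralty digit, 1-6 (hypothetical field name)
    tlp: str                       # hypothetical field name
    alternative_hypotheses: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return a human-readable list of tradecraft violations, empty when clean."""
        issues = []
        if self.estimative_probability not in PROBABILITY_PHRASES:
            issues.append("estimative_probability is not on the ICD 203 ladder")
        if self.analytic_confidence not in CONFIDENCE_LEVELS:
            issues.append("analytic_confidence must be low, moderate, or high")
        if self.source_reliability not in set("ABCDEF"):
            issues.append("source_reliability must be A-F")
        if self.information_credibility not in set("123456"):
            issues.append("information_credibility must be 1-6")
        if self.tlp not in TLP_MARKINGS:
            issues.append("tlp is not a TLP 2.0 marking")
        if len(self.alternative_hypotheses) < 2:
            issues.append("at least two alternative hypotheses are required")
        return issues
```

Collecting every violation, instead of raising on the first, lets a renderer or a retry prompt report all of them in one pass.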