A CI failure analysis system that remembers prior incidents and routes the next fix faster.

ForgeBeyond parses failed runs, separates the primary failure from downstream noise, checks failure memory, and writes a PR/MR-native explanation with confidence and next action.

Working product shape
Problem
Recurring CI failures force engineers to reconstruct context from logs, git blame, Slack, and memory.
System
Deterministic parsers and taxonomy first; bounded AI and memory only when evidence supports it.
Output
PR/MR comment with root cause, owner hint, confidence, prior incident recall, and next action.

The engine starts with artifacts, not vibes.

JUnit XML, pytest output, TAP, raw build logs, workflow YAML, git diff, CODEOWNERS, prior FMOs
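As an illustration of starting from artifacts, here is a minimal sketch of pulling failed test cases out of a JUnit XML report using only the Python standard library. The sample report, the `failed_cases` helper, and the output dictionary keys are invented for this example and are not taken from ForgeBeyond.

```python
import xml.etree.ElementTree as ET

# Invented sample report; real reports vary by test runner.
SAMPLE_JUNIT = """<testsuite name="api" tests="3" failures="1" errors="1">
  <testcase classname="tests.test_auth" name="test_login" time="0.01"/>
  <testcase classname="tests.test_auth" name="test_token" time="0.02">
    <failure message="assert 401 == 200">sample traceback text</failure>
  </testcase>
  <testcase classname="tests.test_db" name="test_connect" time="0.00">
    <error message="ConnectionRefusedError">sample traceback text</error>
  </testcase>
</testsuite>"""


def failed_cases(junit_xml: str) -> list[dict]:
    """Return one record per <testcase> that has a <failure> or <error> child."""
    root = ET.fromstring(junit_xml)
    results = []
    # iter() walks the whole tree, so this works for both a single
    # <testsuite> root and a <testsuites> wrapper.
    for case in root.iter("testcase"):
        for kind in ("failure", "error"):
            node = case.find(kind)
            if node is not None:
                results.append({
                    "test": f"{case.get('classname')}::{case.get('name')}",
                    "kind": kind,
                    "message": node.get("message", ""),
                })
    return results


cases = failed_cases(SAMPLE_JUNIT)
for c in cases:
    print(c["kind"], c["test"], c["message"])
```

Passing tests produce no record, so the downstream steps only see evidence tied to an actual failure or error element.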

The result lands where engineers already work.

failure class, root-cause hypothesis, confidence score, primary anchor, secondary noise, likely owner, prior fix context, next action
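The fields above could be carried in a single record and rendered as a plain-text PR/MR comment. The sketch below is one hypothetical shape: the `AnalysisResult` class, the `render_comment` function, and every sample value are assumptions made for illustration, not the product's actual schema or wording.

```python
from dataclasses import dataclass


@dataclass
class AnalysisResult:
    # Field names mirror the output list above; types are assumed.
    failure_class: str
    root_cause_hypothesis: str
    confidence_score: float
    primary_anchor: str
    secondary_noise: list[str]
    likely_owner: str
    prior_fix_context: str
    next_action: str


def render_comment(r: AnalysisResult) -> str:
    """Render the result as a short plain-text comment body."""
    lines = [
        f"Failure class: {r.failure_class}",
        f"Root-cause hypothesis: {r.root_cause_hypothesis}",
        f"Confidence: {r.confidence_score:.2f}",
        f"Primary anchor: {r.primary_anchor}",
        f"Secondary noise: {len(r.secondary_noise)} downstream failure(s) suppressed",
        f"Likely owner: {r.likely_owner}",
        f"Prior fix context: {r.prior_fix_context}",
        f"Next action: {r.next_action}",
    ]
    return "\n".join(lines)


# Invented example values.
result = AnalysisResult(
    failure_class="dependency_drift",
    root_cause_hypothesis="pinned package no longer resolvable",
    confidence_score=0.8,
    primary_anchor="install step, first resolver error",
    secondary_noise=["test_a import error", "test_b import error"],
    likely_owner="@platform-team",
    prior_fix_context="earlier incident fixed by updating the pin",
    next_action="update the pinned version and rerun",
)
comment = render_comment(result)
print(comment)
```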

The memory story is tested against cases designed to break it.

The docs preserve misses, reruns, and ambiguity. Current honest claim: memory recall is stable on repeated contract cases and cuts average token use on them by roughly 69% (532.5 vs 1696.4); superiority claims wait until evals beat both no-memory and generic baselines.

contract-echo repeat trials: 3 trials

memory retrieved 3/3 repeated cases each time

contract-echo token use: 532.5 avg

vs 1696.4 avg in no-memory mode
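For readers who want the relative saving behind those two averages, the arithmetic is a one-liner. The sketch below uses only the two figures reported above; the function and constant names are made up for the example.

```python
# Averages reported above for contract-echo cases.
MEMORY_AVG_TOKENS = 532.5
NO_MEMORY_AVG_TOKENS = 1696.4


def relative_reduction(with_memory: float, without_memory: float) -> float:
    """Fraction of tokens saved relative to the no-memory run."""
    return 1 - with_memory / without_memory


reduction = relative_reduction(MEMORY_AVG_TOKENS, NO_MEMORY_AVG_TOKENS)
print(f"{reduction:.1%}")  # prints 68.6%
```

This describes token use on the repeated contract cases only; it says nothing about answer quality, which is what the eval gate covers.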

synthetic families: 28 cases

recurrence, dependency drift, wrapper noise, cross-repo breakage

release rule: fail closed

no “memory moat” claim unless evals beat both no-memory mode and the generic baseline
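A fail-closed gate of this kind can be sketched as a function that returns `True` only when every score is present and the memory run strictly beats both comparisons. The function name, the use of a single numeric score per mode, and the strict-inequality choice are assumptions for illustration.

```python
from typing import Optional


def memory_claim_allowed(
    memory: Optional[float],
    no_memory: Optional[float],
    baseline: Optional[float],
) -> bool:
    """Allow the claim only if memory strictly beats both comparisons.

    Fail closed: a missing score, a tie, or a loss all block the claim.
    """
    if memory is None or no_memory is None or baseline is None:
        return False  # missing evidence never unlocks the claim
    return memory > no_memory and memory > baseline


print(memory_claim_allowed(0.82, 0.74, 0.79))  # True
print(memory_claim_allowed(0.82, 0.74, None))  # False: baseline eval missing
print(memory_claim_allowed(0.79, 0.74, 0.79))  # False: a tie is not a win
```

The default answer is "no": the claim has to be earned by complete results rather than blocked by a detected failure.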

Four boundaries keep the product credible.

Rules first

Known failure patterns are classified deterministically from evidence packets before any model call.
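One way to picture "rules first" is an ordered table of patterns that is consulted before anything else, with `None` signalling that no rule matched and a later stage may be needed. The rule names and the `classify` helper below are hypothetical; the log fragments matched are common pip, JVM, and Python messages, not ForgeBeyond's actual taxonomy.

```python
import re
from typing import Optional

# Ordered rule table: the first matching pattern wins.
# Class names are illustrative, not the product's taxonomy.
RULES = [
    ("dependency_resolution", re.compile(
        r"Could not find a version that satisfies|No matching distribution found")),
    ("out_of_memory", re.compile(r"OutOfMemoryError")),
    ("test_assertion", re.compile(r"AssertionError")),
]


def classify(log_text: str) -> Optional[str]:
    """Return the first matching failure class, or None if no rule applies."""
    for name, pattern in RULES:
        if pattern.search(log_text):
            return name
    return None  # caller decides whether a model call is justified


print(classify("ERROR: No matching distribution found for foo==9.9"))
print(classify("java.lang.OutOfMemoryError: Java heap space"))
print(classify("an unfamiliar failure"))
```

Because the table is ordered and the patterns are fixed, the same log text always yields the same class.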

Memory as context

Failure Memory Objects supply prior incidents and fixes, but do not override present-run evidence.
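The boundary can be expressed structurally: the classification is set from the current run, and memory can only append context to it. The sketch below assumes a very small Failure Memory Object with two fields and an exact class match for recall; both are simplifications chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FailureMemoryObject:
    # Hypothetical minimal shape of an FMO.
    failure_class: str
    fix_summary: str


@dataclass
class Analysis:
    failure_class: str  # always derived from present-run evidence
    prior_fix_context: list[str] = field(default_factory=list)


def attach_memory(current_class: str,
                  memories: list[FailureMemoryObject]) -> Analysis:
    """Add matching prior fixes as context without touching the classification."""
    analysis = Analysis(failure_class=current_class)
    for fmo in memories:
        if fmo.failure_class == current_class:
            analysis.prior_fix_context.append(fmo.fix_summary)
    return analysis


memories = [
    FailureMemoryObject("dependency_drift", "updated the pinned version"),
    FailureMemoryObject("flaky_network", "added a retry to the fetch step"),
]
analysis = attach_memory("dependency_drift", memories)
print(analysis)
```

A memory with a different class is simply not attached; nothing in this path can rewrite `failure_class`.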

Confidence is visible

Every result carries evidence strength, pattern match, signal completeness, and classification clarity.
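A sketch of how the four components might travel together is shown below. The source names the components but not how they combine, so the weakest-link `overall` rule (taking the minimum) is purely an assumption for illustration, as are the 0 to 1 scale and the class name.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Confidence:
    # The four components named above, each assumed to lie in [0, 1].
    evidence_strength: float
    pattern_match: float
    signal_completeness: float
    classification_clarity: float

    def overall(self) -> float:
        """Assumed weakest-link summary: the lowest component caps the score."""
        return min(asdict(self).values())


conf = Confidence(
    evidence_strength=0.9,
    pattern_match=0.8,
    signal_completeness=0.5,
    classification_clarity=0.7,
)
print(asdict(conf))
print(conf.overall())  # 0.5
```

Keeping the components alongside any summary number lets a reader see which dimension is weak instead of trusting a single score.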

Public corpus is sanitized

The open-source path stores normalized memories, provenance, and fix summaries rather than raw private logs.
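To make "normalized rather than raw" concrete, here is a minimal redaction sketch that replaces emails, IPv4 addresses, and 40-character hex commit hashes with placeholders. The pattern list and the `normalize` helper are assumptions for illustration; a real sanitizer would need a broader and audited rule set.

```python
import re

# Illustrative redaction rules, applied in order.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "<email>"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"\b[0-9a-f]{40}\b"), "<sha>"),
]


def normalize(line: str) -> str:
    """Replace identifying tokens with stable placeholders."""
    for pattern, placeholder in REDACTIONS:
        line = pattern.sub(placeholder, line)
    return line


raw = "push by dev@example.com from 10.0.0.12 at " + "a" * 40
print(normalize(raw))  # push by <email> from <ip> at <sha>
```

Stable placeholders also help recall: two incidents that differ only in who pushed or which commit failed normalize to the same text.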