Aadi Behaviour Systems

ATOM

Verified reasoning through multi-model consensus


The bottleneck in AI-assisted reasoning isn't knowledge retrieval but navigation: knowing which claims to trust, when to stop decomposing, and how to surface genuine disagreement between models.

Approach
ATOM breaks complex questions into atomic, independently verifiable claims. Rather than relying on a single large model's confidence, each atom is probed by a panel of three diverse 1B-parameter models, with consensus and disagreement treated as first-class signals. A learned stopping heuristic (FRSM) decides when further decomposition stops paying off; analysis of the system's failures shows that most reasoning chains stop one hop too early.
What's novel
i. Atomic decomposition

Questions are recursively split into the smallest independently verifiable claims, creating a reasoning tree where each leaf can be checked in isolation.
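A minimal sketch of such a reasoning tree, assuming a simple node type; the `Atom` class and its fields are illustrative, not ATOM's actual API:

```python
# Hypothetical atomic-claim tree: each node holds one claim, and the
# leaves are the smallest independently verifiable statements.
from dataclasses import dataclass, field


@dataclass
class Atom:
    claim: str                                   # an independently verifiable statement
    children: list["Atom"] = field(default_factory=list)

    def leaves(self) -> list["Atom"]:
        """Collect the leaf claims that can be checked in isolation."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


# A two-level decomposition: the root claim splits into two checkable facts.
root = Atom("The Eiffel Tower is taller than the Statue of Liberty", [
    Atom("The Eiffel Tower is about 330 m tall"),
    Atom("The Statue of Liberty is about 93 m tall"),
])
```

Each leaf can then be sent to the model panel independently, and the root verdict is assembled from the leaf verdicts.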

ii. Multi-model consensus

Three diverse 1B-parameter models outperform a single 8B model. Disagreement between models is a feature, not a bug, surfacing genuine ambiguity in the evidence.
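The consensus step can be sketched as a majority vote that also reports whether the panel was unanimous; the function name and verdict format are assumptions for illustration, not ATOM's real interface:

```python
# Majority vote over per-model verdicts for one atomic claim.
# A split panel is surfaced as a signal rather than averaged away.
from collections import Counter


def consensus(verdicts: list[str]) -> tuple[str, bool]:
    """Return (majority verdict, unanimous?) for one atomic claim."""
    counts = Counter(verdicts)
    winner, _ = counts.most_common(1)[0]
    return winner, len(counts) == 1


verdict, unanimous = consensus(["true", "true", "false"])
# The majority verdict is kept, and unanimous=False flags genuine
# ambiguity in the evidence for downstream handling.
```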

iii. 77% one-hop-short finding

Analysis of 14,502 classified predictions revealed that 77% of reasoning failures occur because the chain stops one decomposition step too early.

iv. FRSM substrate direction

A learned stopping heuristic that predicts when further decomposition will yield diminishing returns, moving toward substrate-independent reasoning evaluation.
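As a stand-in for the learned heuristic, the decision interface can be sketched as a logistic score over simple node features; the features and weights below are invented for illustration (FRSM's parameters are learned, not hand-set):

```python
# Illustrative stopping rule: score the current node and stop decomposing
# once the estimated probability of diminishing returns crosses a threshold.
import math


def stop_probability(depth: int, panel_agreement: float, claim_len: int) -> float:
    """Estimate P(further decomposition yields diminishing returns).

    Hand-picked weights stand in for learned parameters: deeper nodes and
    higher panel agreement push toward stopping; long, compound claims
    push toward splitting further.
    """
    z = 0.8 * depth + 2.0 * panel_agreement - 0.02 * claim_len - 2.5
    return 1.0 / (1.0 + math.exp(-z))


def should_stop(depth: int, panel_agreement: float, claim_len: int,
                threshold: float = 0.5) -> bool:
    return stop_probability(depth, panel_agreement, claim_len) >= threshold
```

Given the one-hop-short finding, a threshold calibrated against classified predictions would be biased toward one more decomposition step than the raw score suggests.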

FastAPI · React · PyTorch · MLX · OpenRouter

Built for local-first inference with MLX on Apple Silicon, with OpenRouter fallback for model diversity.
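The local-first pattern reduces to trying on-device inference and falling back to the hosted panel on failure; `local_fn` and `remote_fn` below are placeholders for the MLX and OpenRouter calls, not real client APIs:

```python
# Sketch of local-first inference with a hosted fallback.
from typing import Callable


def generate(prompt: str,
             local_fn: Callable[[str], str],
             remote_fn: Callable[[str], str]) -> str:
    """Prefer local inference; fall back to the remote endpoint on error."""
    try:
        return local_fn(prompt)
    except Exception:
        # Local model unavailable (e.g. no Apple Silicon) -> hosted fallback.
        return remote_fn(prompt)
```

The same dispatch point is also where model diversity comes in: the fallback can fan out to a different model family than the local one.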

983 tests
14,502 classified predictions
77% of failures stop one hop short
Status

Active development. Core decomposition and consensus pipeline complete. Currently refining the FRSM stopping heuristic and building the evaluation harness.