What’s actually inside,
on the label.
Market Scholar is a machine-learning forensic engine with its own AI credibility model. It has been validated against the academic theory of how market narratives move prices, and forward-tested in live paper-trading books. Here are the results, with the caveats attached. Every number on this page is reproducible from the system’s own database.
Two layers. Kept honest and kept apart.
Most “AI for markets” blurs measurement and money until you can’t tell what was actually proven. We don’t. Market Scholar is built as two distinct layers, and we report each on its own terms.
The measurement layer
The forensic engine that scores every company: credibility, filing drift, coordination, decay, fair-value divergence. This is the part that’s been written up as academic working papers and validated against the data — it predicts the structure of a narrative, with effect sizes we report in full.
The trading layer
A separate, signal-driven book — named strategies, each with its own direction and a walk-forward edge grade, sized by realized performance. It does not trade the forensic verdicts directly. This is where the paper P&L is earned, and we judge it on book P&L, not story.
The discipline behind every figure: walk-forward training, time-ordered splits, and observation-time-only features — nothing the model couldn’t have known at the moment it scored.
It runs on language models. It doesn’t think like one.
Off-the-shelf models (Gemini, Claude) do the reading — parsing claims out of filings and articles. But the judgment is ours: a proprietary Narrative Credibility Classifier, retrained nightly on the system’s own outcome-labeled data.
- 34 observation-time features — speaker authority, market regime, narrative phase, sector momentum
- Trained on 67,941 outcome-derived labels, walk-forward with a 30-day buffer
- On the one task that matters — “will the market validate this story?” — it beats general-purpose LLMs
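For readers who want to picture walk-forward training with a buffer, here is a minimal sketch. It is an illustration under stated assumptions, not the production pipeline: the helper name `walk_forward_splits`, the 90-day test window, and the toy daily calendar are all hypothetical. Only the 30-day buffer comes from the text above.

```python
from datetime import date, timedelta

def walk_forward_splits(dates, first_test_start, test_days=90, buffer_days=30):
    """Yield (train_dates, test_dates) pairs in time order.

    Hypothetical helper: training rows stop `buffer_days` before each test
    window opens, so outcomes that resolve inside the buffer cannot leak
    into training.
    """
    dates = sorted(dates)
    last = dates[-1]
    test_start = first_test_start
    while test_start <= last:
        test_end = test_start + timedelta(days=test_days)
        train_cutoff = test_start - timedelta(days=buffer_days)
        train = [d for d in dates if d < train_cutoff]
        test = [d for d in dates if test_start <= d < test_end]
        if train and test:
            yield train, test
        test_start = test_end  # roll the window forward, never backward

# Toy daily calendar (assumed for illustration only)
calendar = [date(2025, 1, 1) + timedelta(days=i) for i in range(365)]
splits = list(walk_forward_splits(calendar, date(2025, 4, 1)))

for train, test in splits:
    gap = (min(test) - max(train)).days
    print(f"train ends {max(train)}, test starts {min(test)}, gap {gap} days")
```

Each training set grows as the window rolls forward, and every test window sits strictly after its training data plus the buffer.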
The textbook said stories move markets. We measured it.
Nobel laureate Robert Shiller built “Narrative Economics” on four claims: that narratives are quantifiable, decay over time, spread by contagion, and create mispricings that revert. Prior work tested them with Google Trends and word counts. We tested them at the granularity of one company on one day, and found support for all four.
Narratives are quantifiable
187 companies reduced to ~30 daily forensic features each — coordination, filing-drift, decay, credibility — across 12,747 daily classifications and 287,482 narrative observations.
Narratives have measurable lifecycles that decay
Stories in their “dying” phase (5–20% energy remaining) returned +4.01% over 5 days vs +1.18% for full-energy stories (n = 531 vs 6,170). Decay rate predicts returns at r = +0.072.
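To make the “dying” band concrete, here is a small sketch of how an energy fraction could be mapped to that phase. The 5–20% band is the one quoted above. The exponential decay form, the function names, and the 10-day half-life are assumptions for illustration, not the system's actual model.

```python
import math

DYING_LOW, DYING_HIGH = 0.05, 0.20  # band quoted in the text

def energy_remaining(days_elapsed, half_life_days):
    """Fraction of peak energy left, ASSUMING simple exponential decay."""
    return 0.5 ** (days_elapsed / half_life_days)

def is_dying(energy_fraction):
    """True when energy sits inside the 5-20% band."""
    return DYING_LOW <= energy_fraction <= DYING_HIGH

def dying_window(half_life_days):
    """Days after peak at which a narrative enters and leaves the band."""
    enter = half_life_days * math.log2(1 / DYING_HIGH)
    leave = half_life_days * math.log2(1 / DYING_LOW)
    return enter, leave

# Hypothetical narrative with a 10-day half-life
enter, leave = dying_window(half_life_days=10)
print(f"dying phase spans roughly day {enter:.1f} to day {leave:.1f}")
```

Under this assumed decay form, a shorter half-life moves the dying window closer to the peak and makes it narrower.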
Narratives spread by contagion
Coordinated, high-drift coverage continued to drift +3.36% over 5 days vs +1.26% control. Themes lead each other on a clock: Edge-AI precedes Quantum by ~5 days (r = +0.61).
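A lead-lag relationship like the one above is typically found by correlating one series against a shifted copy of another. The sketch below shows that generic technique on synthetic data. The function names and the toy series are hypothetical, and this is not necessarily how the paper estimated its lag.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def lagged_correlation(leader, follower, lag):
    """Correlation between leader[t] and follower[t + lag], for lag >= 0."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    x = leader[: len(leader) - lag]
    y = follower[lag:]
    return pearson(x, y)

def best_lag(leader, follower, max_lag):
    """Lag (in days) at which the leader best explains the follower."""
    return max(range(max_lag + 1),
               key=lambda k: lagged_correlation(leader, follower, k))

# Synthetic example: the follower is the leader delayed by exactly 5 days
leader = [math.sin(i / 3) + 0.3 * math.cos(i * 1.7) for i in range(80)]
follower = [0.0] * 5 + leader[:-5]

lag = best_lag(leader, follower, max_lag=10)
print(lag, round(lagged_correlation(leader, follower, lag), 3))
```

On real theme-intensity series the peak correlation would be well below 1, and the chosen lag should be checked out of sample before it is trusted.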
Narrative dynamics produce mispricings that revert
Exhausted narratives on moderately undervalued names returned +4.19% (61.7% up); on moderately overvalued names, +0.03% (43.4% up) — a +4.16-point reversion-to-fundamentals gap.
Seven independent findings
Each survives walk-forward, observation-time-only testing across the 17-month validation panel. Effect sizes for the novel narrative findings run r = 0.07–0.14.
Source: Walsh (2026), “Forensic Narrative Classification and Equity Returns,” Market Prism Working Paper, SSRN preprint. Validation dataset: 187 US equities, 17 months, 287,482 narrative observations, five calibration regimes.
Forward-tested, with money on the line.
The trading layer runs as live paper books mirrored to brokerage paper accounts: real fills, real slippage, no benefit of hindsight. Two books, two months, both net positive, and every trade was opened after the model said so.
$300K book
100% forward · no backtest
$50K book
Forward / live
The caveats are part of the credibility.
A forensic engine that won’t name its own limits isn’t forensic. Here’s what the evidence does not support — stated by us, before anyone else has to.
Measurement, not magic.
Effect sizes for the novel narrative findings run r = 0.07–0.14 (R² of 0.5–2%) — squarely in the range of credible behavioral-finance research, not “physics-grade” prediction.
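The conversion from r to R² is simple arithmetic: square the correlation to get the share of variance explained. A quick check of the quoted range (the function name is just for illustration):

```python
def r_squared_pct(r):
    """Share of variance explained, in percent, for a correlation r."""
    return 100 * r * r

low, high = r_squared_pct(0.07), r_squared_pct(0.14)
print(f"R^2 range: {low:.2f}% to {high:.2f}%")
```

That is 0.49% to 1.96%, which rounds to the 0.5–2% stated above.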
Some raw spreads are just beta.
In a separate beta-adjusted audit, four raw narrative-state return spreads (coordinated campaigns, rapid decay, energy transitions, high suspicion) shrink to statistical noise once market beta is removed. We publish that correction ourselves.
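As an illustration of what removing market beta means, the sketch below uses a single-factor market model: estimate beta as the slope of stock returns on market returns, then subtract beta times the market return. This is one common approach and an assumption here. The text does not say which adjustment the audit used, and the return figures are toy data.

```python
def ols_beta(stock, market):
    """Slope of stock returns on market returns (covariance / variance)."""
    n = len(stock)
    mean_s, mean_m = sum(stock) / n, sum(market) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(stock, market))
    var = sum((m - mean_m) ** 2 for m in market)
    return cov / var

def beta_adjusted(stock, market):
    """Returns left over after subtracting beta times the market return."""
    beta = ols_beta(stock, market)
    return [s - beta * m for s, m in zip(stock, market)]

# Toy data: a stock that is nothing but 1.5x the market
market = [0.010, -0.020, 0.015, 0.005, -0.010]
stock = [1.5 * m for m in market]

beta = ols_beta(stock, market)
residuals = beta_adjusted(stock, market)
print(round(beta, 3), [round(r, 6) for r in residuals])
```

In this toy case the stock's raw returns look large, but the adjusted returns are all essentially zero. A raw spread that is only market exposure disappears in the same way once beta is removed.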
One market regime.
The validation window (Dec 2024 – May 2026) is a single mixed bull/choppy regime. Effects could attenuate — or invert — in an extended bear market. We say so in the paper.
Forensics ≠ a recommendation.
The scorecard measures how a story holds up against the record. The live trading book is a separate, signal-driven layer. We never sell the verdict as the trade.
Three patents pending on the methods that make it work
Multi-dimensional analyst & narrative credibility assessment
The credibility classifier — scoring who said what, in which regime, at which phase of which story.
Narrative lifecycle tracking with decay monitoring
The energy-and-decay model: fitting each narrative its own half-life and exhaustion point.
Inference-time temporal methodology
The third filing in the family covering the underlying forensic methodology.
Every test — including null results and findings that later failed — is written to a timestamped scientific audit log. The working papers are posted as SSRN preprints with their methodology, sample sizes, and limitations in full.
- Walsh (2026) — Forensic Narrative Classification and Equity Returns
- Walsh (2026) — Narrative Lifecycle States as Attention & Risk-Loading Regimes
- JEL classification G12, G14, G17, G40
See the system read a live company
Run a forensic audit on any ticker, or talk to us about deploying Market Scholar across research, compliance, and media intelligence.