Field notes

The work, written down as we ship it.

Methodology essays, architecture decisions, validation deep-dives, and the regulatory context behind the design choices. Curated by the MolTrace team — written for analysts, engineers, and regulatory reviewers who want the actual reasoning.


Featured essay

Methodology · Forthcoming

Why we count chemical environments, not peaks

The expert-reference vs detector-output mismatch that nearly broke our promotion gate — and the clustering layer that fixed it.

NMRShiftDB2 references count distinct chemical environments; detectors faithfully resolve multiplet lines. Median peak-count deltas of 17 looked like an algorithm failure; they were a units mismatch. Field notes from the Phase 10 multiplet-clustering work.

2026-05-27 · 9 min read · MolTrace research team
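To make the units mismatch concrete, here is a minimal sketch of collapsing multiplet lines into chemical environments. It assumes a simple gap-based grouping with a hypothetical `tol_ppm` tolerance; it is not necessarily the clustering that the Phase 10 work shipped, and the peak list is illustrative, not taken from the corpus.

```python
from typing import List


def cluster_environments(peaks_ppm: List[float], tol_ppm: float = 0.05) -> List[List[float]]:
    """Group peak positions so that a gap larger than tol_ppm starts a new cluster.

    Assumption: lines of one multiplet sit closer together than tol_ppm,
    while distinct environments are further apart.
    """
    clusters: List[List[float]] = []
    for ppm in sorted(peaks_ppm):
        if clusters and ppm - clusters[-1][-1] <= tol_ppm:
            clusters[-1].append(ppm)  # same environment as the previous line
        else:
            clusters.append([ppm])    # gap too large: new environment
    return clusters


# Hypothetical detector output: a triplet near 1.20 ppm and a quartet near 3.60 ppm.
lines = [1.188, 1.200, 1.212, 3.582, 3.594, 3.606, 3.618]
envs = cluster_environments(lines)
print(len(lines), "lines ->", len(envs), "environments")
```

A reference that lists two environments and a detector that reports seven lines disagree by five on this toy input even though both are correct, which is the kind of delta the essay describes.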

Editorial streams

Three streams, one editorial standard.

We publish across science, engineering, and methodology — each stream has its own audience but shares the same rigor.

Science
Methodology essays, validation deep-dives, and notes from the analytical team.
Engineering
Architecture decisions, contract design, perf wins, and the instrumentation under the hood.
Methodology
How we measure ourselves. Promotion gates, regression-corpus design, and what 'experimental' really means.

Engineering · Forthcoming

A regression test that fails by fixture_id

How a 20-fixture A/B JSON sidecar replaced our 'looks-good-to-me' detector reviews.

Every detector change runs against a curated NMRShiftDB2 corpus before merge. CI fails by name when any single fixture drifts >50% — so reviewers see 'nmrshiftdb2_60000006_13c regressed' instead of 'tests passed (with notes).' The boring infrastructure that quietly raised our ship velocity.

2026-05-27 · 7 min read
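A rough sketch of what a fail-by-name check like the one described above could look like. The function name, the per-fixture metric (a single number such as a peak count), and the example values are assumptions for illustration; only the 50% threshold and the fixture id come from the text.

```python
from typing import Dict, List


def drifted_fixtures(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    max_rel_drift: float = 0.5,
) -> List[str]:
    """Return one message per fixture whose metric moved more than max_rel_drift."""
    failures: List[str] = []
    for fixture_id, base in sorted(baseline.items()):
        if fixture_id not in candidate:
            failures.append(f"{fixture_id} missing from candidate run")
            continue
        new = candidate[fixture_id]
        drift = abs(new - base) / max(abs(base), 1e-12)  # guard against a zero baseline
        if drift > max_rel_drift:
            failures.append(f"{fixture_id} regressed ({base:g} -> {new:g}, drift {drift:.0%})")
    return failures


# Hypothetical sidecar values for two fixtures.
baseline = {"nmrshiftdb2_60000006_13c": 10, "example_fixture_b": 8}
candidate = {"nmrshiftdb2_60000006_13c": 16, "example_fixture_b": 9}
failures = drifted_fixtures(baseline, candidate)
for message in failures:
    print(message)
```

In CI, a non-empty `failures` list would fail the job and the messages would be the first thing a reviewer sees.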


Methodology · Forthcoming

What 'experimental' actually means in our promotion gate

Every new AI backend ships as opt-in. Promotion to default is a published-threshold decision, not a vibes call.

GSD-Prompt-3 shipped as `experimental: true` with a documented gate (95% solvent detect, median compound-count delta ≤2). Until both clear, the default stays legacy. Most AI startups ship and patch; we publish the corpus, the threshold, and the date a feature crosses each one.

2026-05-27 · 6 min read
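The two-threshold gate above can be expressed as a small predicate. This is a sketch under assumptions: the `FixtureResult` type and function name are hypothetical, and "95% solvent detect" is read here as a detection rate of at least 0.95 across the corpus.

```python
from statistics import median
from typing import List, NamedTuple


class FixtureResult(NamedTuple):
    solvent_detected: bool
    compound_count_delta: int  # |predicted count - reference count|


def gate_clears(
    results: List[FixtureResult],
    min_solvent_rate: float = 0.95,
    max_median_delta: float = 2.0,
) -> bool:
    """True only when both published thresholds are met on the corpus."""
    if not results:
        raise ValueError("cannot evaluate a promotion gate on an empty corpus")
    solvent_rate = sum(r.solvent_detected for r in results) / len(results)
    median_delta = median(r.compound_count_delta for r in results)
    return solvent_rate >= min_solvent_rate and median_delta <= max_median_delta


# Hypothetical 20-fixture run: 19 solvent hits, every count delta equal to 1.
run = [FixtureResult(True, 1)] * 19 + [FixtureResult(False, 1)]
print("promote to default:", gate_clears(run))
```

Until the predicate returns true, the backend would stay behind `experimental: true` and the default would remain legacy.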


Regulatory · Forthcoming

No confidence number without an audit trail

Why we'd rather show 'pending' than a polished score with no provenance.

Every numerical claim in the UI links to its source — the spectrum file, the picked peaks, the SMILES candidate, the literature citation, the human reviewer who signed off. The implementation cost is real. The regulatory cost of doing it otherwise is higher.

2026-05-21 · 8 min read
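One way to enforce "pending rather than an unsourced score" is to make the display function refuse to render a number unless every provenance link is present. The field names below mirror the sources listed in the teaser, but the class, the function, and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    spectrum_file: Optional[str] = None
    picked_peaks_ref: Optional[str] = None
    smiles_candidate: Optional[str] = None
    citation: Optional[str] = None
    reviewer: Optional[str] = None

    def complete(self) -> bool:
        """Every link in the audit trail must be non-empty."""
        return all(value for value in vars(self).values())


def display_confidence(score: float, provenance: Provenance) -> str:
    """Render the score only when its full audit trail exists."""
    return f"{score:.2f}" if provenance.complete() else "pending"


# Placeholder values for illustration only.
full = Provenance("example.fid", "peaks/example.json", "CCO", "placeholder-citation", "reviewer-id")
partial = Provenance(spectrum_file="example.fid")
print(display_confidence(0.93, full), display_confidence(0.93, partial))
```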


Engineering · Forthcoming

From Bruker SFO1 to GSD: plumbing instrument metadata through the contract

A 500 MHz field value hardcoded in the front end became a real number read from the vendor metadata. Three lines of code, one cascade, no contract change.

Phase 8 traced field_mhz through the preview → process → analyze chain so the GSD endpoint receives the spectrometer frequency the instrument actually used (600.13 MHz, in our verification fixture) instead of a hardcoded 500. The same plumbing pattern works for vendor / solvent / nucleus.

2026-05-27 · 5 min read
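A sketch of the plumbing pattern, assuming the vendor metadata arrives as a mapping with an `SFO1` entry in MHz (the Bruker acquisition parameter named in the title). The three stage functions and their payload shapes are hypothetical stand-ins for the preview, process, and analyze steps; the point is that the value is read once and carried forward rather than defaulted.

```python
from typing import Any, Dict, Mapping


def resolve_field_mhz(vendor_metadata: Mapping[str, Any]) -> float:
    """Read the spectrometer frequency from metadata instead of assuming one."""
    raw = vendor_metadata.get("SFO1")
    if raw is None:
        raise KeyError("SFO1 missing from vendor metadata; refusing to assume a field strength")
    value = float(raw)
    if value <= 0:
        raise ValueError(f"implausible SFO1 value: {raw!r}")
    return value


def preview(vendor_metadata: Mapping[str, Any]) -> Dict[str, Any]:
    return {"field_mhz": resolve_field_mhz(vendor_metadata)}


def process(preview_result: Dict[str, Any]) -> Dict[str, Any]:
    return {**preview_result, "processed": True}  # carry field_mhz forward unchanged


def analyze_request(process_result: Dict[str, Any]) -> Dict[str, Any]:
    return {"field_mhz": process_result["field_mhz"]}


payload = analyze_request(process(preview({"SFO1": "600.13"})))
print(payload)
```

The same shape would apply to vendor, solvent, or nucleus: resolve once at the edge, then pass the value through each stage.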


Engineering · Forthcoming

Why legacy's fit χ² of 10¹⁵ is honest

Per-peak QC metrics landed on legacy peaks and immediately surfaced a units mismatch. We shipped the column anyway.

GSD reports fit residuals normalized to baseline σ; legacy reports them in raw signal-domain units. The same threshold paints 31/37 peaks 'red' on legacy spectra. The right fix is detector-side normalization — but in the meantime, the column tells the truth.

2026-05-28 · 6 min read
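The scale difference is easy to reproduce. The sketch below assumes "normalized to baseline σ" means dividing each residual by the standard deviation of a signal-free baseline region before squaring; the function names and numbers are illustrative, not values from either detector.

```python
from statistics import pstdev
from typing import Sequence


def raw_chi2(residuals: Sequence[float]) -> float:
    """Sum of squared residuals in raw signal-domain units."""
    return sum(r * r for r in residuals)


def normalized_chi2(residuals: Sequence[float], baseline: Sequence[float]) -> float:
    """Sum of squared residuals expressed in units of baseline noise."""
    sigma = pstdev(baseline)
    if sigma == 0:
        raise ValueError("baseline region has zero variance; cannot normalize")
    return sum((r / sigma) ** 2 for r in residuals)


# Hypothetical raw-intensity residuals and a noise-only baseline region.
residuals = [2.0e6, -1.0e6, 1.5e6]
baseline = [1.0e6, -1.0e6, 1.0e6, -1.0e6]
print(raw_chi2(residuals), normalized_chi2(residuals, baseline))
```

On this toy input the raw value is on the order of 10¹² while the normalized value is single-digit, so one shared threshold cannot be meaningful for both.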


Science · Forthcoming

Validation against references that count the way detectors count

NMRShiftDB2 said the algorithm was failing. HMDB-style references said it was clearing the strict gate. Both were right.

Same algorithm, two corpora, two verdicts. The Phase 14 framework added expert-curated multiplet-line references so we could finally separate detector quality from corpus granularity. Strict gate cleared at multiplet-line scale; NMRShiftDB2 environment-scale stays xfailed by design.

2026-05-27 · 10 min read


Engineering · Forthcoming

Additive, never destructive — across 39 evidence layers

Every existing endpoint and regression test must stay green as new layers land. Here's how the typed-Pydantic contract makes that affordable.

Layer 22 (proton/carbon-13 scoring) and Layer 39 (LCMS feature grouping) speak the same API style. Stable JSON keys, additive fields, openapi-typescript regen on every contract change. The 'never overwrite a prior layer' rule is what lets us ship weekly without breaking last year's dossier.

2026-05-15 · 12 min read
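The additive-only rule can be checked mechanically. The sketch below is a standard-library stand-in, not the Pydantic or openapi-typescript tooling the teaser mentions: it treats a schema as a nested dict of key to type name and reports any key that was removed or retyped. All field names are hypothetical.

```python
from typing import Any, Dict, List

Schema = Dict[str, Any]  # leaf values are type names; nested dicts are sub-objects


def breaking_changes(old: Schema, new: Schema, path: str = "") -> List[str]:
    """List removals and retypings; keys that exist only in `new` are allowed."""
    problems: List[str] = []
    for key, old_type in old.items():
        here = f"{path}.{key}" if path else key
        if key not in new:
            problems.append(f"removed: {here}")
        elif isinstance(old_type, dict) and isinstance(new[key], dict):
            problems.extend(breaking_changes(old_type, new[key], here))
        elif old_type != new[key]:
            problems.append(f"retyped: {here} ({old_type} -> {new[key]})")
    return problems


v1 = {"peaks": {"ppm": "float"}, "score": "float"}
v2 = {"peaks": {"ppm": "float", "width_hz": "float"}, "score": "float", "lcms_groups": "list"}
v3 = {"peaks": {"ppm": "str"}}
print(breaking_changes(v1, v2))  # additive change
print(breaking_changes(v1, v3))  # destructive change
```

A check like this could run on every contract change, so a layer that removes or retypes a prior field fails before merge.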


Regulatory · Forthcoming

Reading the FDA's January 2025 AI framework, in code

Stage-4 human oversight gates aren't a paragraph in a policy; they're a release queue in your audit table.

The FDA's 2025 framework formalizes risk-based credibility for AI in regulatory submissions. We mapped each stage onto concrete code: model-card registry, recipe-hash provenance, human-signoff queue, immutable raw vault. The PRs are linkable; the audit ledger is queryable.

2026-05-10 · 11 min read
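As an illustration of recipe-hash provenance, the sketch below hashes a canonical JSON form of a processing recipe so that the same parameters always yield the same identifier. The choice of SHA-256 over sorted-key JSON, the function name, and the recipe fields are assumptions, not a description of the MolTrace implementation.

```python
import hashlib
import json
from typing import Any, Dict


def recipe_hash(recipe: Dict[str, Any]) -> str:
    """Stable digest of a recipe: key order and whitespace do not affect it."""
    canonical = json.dumps(recipe, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical processing recipes with the same content in different key order.
recipe_a = {"detector": "example-detector", "params": {"tol_ppm": 0.05, "min_snr": 3}}
recipe_b = {"params": {"min_snr": 3, "tol_ppm": 0.05}, "detector": "example-detector"}
recipe_c = {"detector": "example-detector", "params": {"tol_ppm": 0.10, "min_snr": 3}}

print(recipe_hash(recipe_a) == recipe_hash(recipe_b))  # same recipe, same hash
print(recipe_hash(recipe_a) == recipe_hash(recipe_c))  # changed parameter, new hash
```

Storing the digest next to each result would let an audit query find every output produced by a given recipe.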


