AXL Research Log

A per-commit record of the dual-agent research iteration, generated deterministically from the git history at HEAD 91dbceb on 2026-04-21T17:38:16+00:00.

How this was generated

This page is the output of tools/build-research-log.py v1.0, a deterministic script that walks the repo's git history and classifies each commit by role using regex rules that match this repo's authoring conventions. It is not a hand-curated narrative. Running the same script against the same git HEAD produces the same output.

Assumptions this property depends on:

Commit message discipline. The DAG of review-and-response edges is reconstructed from explicit SHA references in commit messages (patterns like "Codex sha review", "Codex sha follow-up", "Codex review of sha"). If future commits stop naming the SHA they respond to, the DAG loses edges. This is a commit-hygiene constraint, not a script bug.
Repo-scoped regex classifier. Role classification depends on this repo's authoring conventions: codex r\d+, Gap \d+, ship:, bench:, spec:, docs:, RESULT. Applying this script to another repo with different conventions produces meaningless classifications.
Metric extraction is best-effort. Numeric transitions in commit bodies are captured when they match known patterns (NN.NN% -> NN.NN%, NNN tests pass, Δrecall ±NN.NN). Claims stated in other forms may not surface as structured metrics. The subject and body text are always preserved verbatim in the JSON so human readers can verify.
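
The three assumptions above can be illustrated with a minimal sketch. It is not the source of tools/build-research-log.py: the regexes, the rule order, the function names, and the "unclassified" fallback are all assumptions modeled on the phrasings and conventions quoted in the list.

```python
import re

# Hypothetical patterns modeled on the phrasings quoted above
# ("Codex sha review", "Codex sha follow-up", "Codex review of sha").
SHA = r"[0-9a-f]{7,40}"
RESPONSE_PATTERNS = [
    re.compile(rf"Codex ({SHA}) (?:review|follow-up)"),
    re.compile(rf"Codex review of ({SHA})"),
]

# Ordered (role, pattern) rules; the first match wins. Both the mapping from
# convention to role name and the ordering are assumptions.
ROLE_RULES = [
    ("codex-review-round", re.compile(r"codex r\d+", re.IGNORECASE)),
    ("substrate-gap", re.compile(r"Gap \d+")),
    ("corpus-result", re.compile(r"\bRESULT\b")),
    ("ship", re.compile(r"^ship:")),
    ("bench", re.compile(r"^bench:")),
    ("spec", re.compile(r"^spec:")),
    ("docs", re.compile(r"^docs:")),
]

# Two of the documented metric shapes: "NN.NN% -> NN.NN%" and "NNN tests pass".
METRIC_PATTERNS = [
    ("pct_transition", re.compile(r"(\d+(?:\.\d+)?)%\s*->\s*(\d+(?:\.\d+)?)%")),
    ("test_count_pass", re.compile(r"(\d+) tests pass")),
]


def response_targets(message: str) -> list[str]:
    """Return the SHAs a commit message explicitly says it responds to."""
    found: list[str] = []
    for pattern in RESPONSE_PATTERNS:
        for sha in pattern.findall(message):
            if sha not in found:
                found.append(sha)
    return found


def classify_role(subject: str) -> str:
    """Return the role of the first rule whose pattern matches the subject."""
    for role, pattern in ROLE_RULES:
        if pattern.search(subject):
            return role
    return "unclassified"  # hypothetical fallback label


def extract_metrics(body: str) -> list[tuple[str, str]]:
    """Return (metric_name, matched_text) pairs, grouped by pattern."""
    metrics = []
    for name, pattern in METRIC_PATTERNS:
        for match in pattern.finditer(body):
            metrics.append((name, match.group(0)))
    return metrics
```

With these rules, `response_targets("... (Codex b176ad2 review)")` yields `["b176ad2"]`, while a message such as `"fix: address review comments"` yields an empty list, which is how an edge drops out of the DAG when the target SHA is not named. A claim phrased outside the known metric shapes is likewise skipped by `extract_metrics`, so the verbatim subject and body remain the reference.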

Known classifier limitations on this specific history:

The R4.1 sub-finding, referenced in the body of the R4 commit 099dbe6, does not appear as its own round-labeled entry because it has no commit of its own. Extraction takes the first round match in a commit, so R4 wins. A future sub-finding would surface as an independent entry only if it had its own commit with its label (for example R4.1) in the subject.
Body-level mentions of round labels occasionally produce false-positive round assignments: a commit that refers to a round in narrative explanation, without being that round, still receives the label. Each case can be checked by reading the subject of the corresponding entry in the commit log below.
The long-form phrasing "round 1" in commit 0a5cad4 is caught by a dedicated pattern; other long-form phrasings of later rounds (if any) may not be caught.
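
The "first match wins" behavior behind these limitations can be sketched as follows. The two patterns and the `first_round` helper are hypothetical stand-ins for the script's round extraction, written only to reproduce the effects described above.

```python
import re

# Hypothetical round patterns: a short form such as "r4", "R5" or "R4.1",
# and the long form "round 1".
ROUND_PATTERNS = [
    re.compile(r"\b[Rr](\d+(?:\.\d+)?)\b"),
    re.compile(r"\bround (\d+)\b", re.IGNORECASE),
]


def first_round(text):
    """Return the earliest round label in text (e.g. "R4"), or None."""
    best = None
    for pattern in ROUND_PATTERNS:
        match = pattern.search(text)
        if match and (best is None or match.start() < best.start()):
            best = match
    return f"R{best.group(1)}" if best else None


# Subject of 099dbe6 followed by an illustrative body line (the real body is
# not reproduced here). The subject's "r4" comes first, so R4.1 never surfaces.
message = (
    "v4: fix canon_date error namespace + stop stale router drift (codex r4)\n\n"
    "Also covers sub-finding R4.1."
)
print(first_round(message))  # R4

# A narrative mention in the body is enough to assign a round (false positive).
print(first_round("spec: tidy wording\n\nFollows the approach agreed in R2."))  # R2
```

Under this sketch, a commit whose subject began with R4.1 would be labeled R4.1, which matches the remedy described above: give the sub-finding its own commit.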

Inputs: git log --reverse --format=... against this repo. Outputs: research-log.json (machine-readable, complete) and this HTML page.
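
A rough sketch of the input side, assuming a record layout of my own choosing. The format string the script actually passes to git is elided above and is not reconstructed here: `LOG_FORMAT`, the separator bytes, and the dictionary keys are hypothetical.

```python
import subprocess

FIELD_SEP = "\x1f"   # ASCII unit separator between fields
RECORD_SEP = "\x1e"  # ASCII record separator between commits

# Hypothetical format: short hash, author date, subject, body.
LOG_FORMAT = "%h%x1f%ad%x1f%s%x1f%b%x1e"


def parse_log(raw):
    """Split raw `git log` output into one dict per commit, oldest first."""
    commits = []
    for record in raw.split(RECORD_SEP):
        record = record.strip("\n")
        if not record:
            continue
        sha, date, subject, body = record.split(FIELD_SEP, 3)
        commits.append(
            {"sha": sha, "date": date, "subject": subject, "body": body.strip()}
        )
    return commits


def read_history(repo="."):
    """Run git in `repo` and parse its history in commit order."""
    raw = subprocess.run(
        ["git", "log", "--reverse", "--date=short", f"--format={LOG_FORMAT}"],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    return parse_log(raw)
```

Separating `parse_log` from the subprocess call keeps the parsing step a pure function of its input text, which is the property the determinism claim rests on.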

Summary

Role	Commits
`bench`	13
`claude-research-impl`	9
`spec`	7
`codex-review-round`	7
`docs`	7
`codex-review-response`	7
`gate-kit`	6
`substrate-gap`	4
`corpus-result`	4
`ship`	1

Total commits: 65. Response edges (commits that name a target SHA): 6. Commits with a formal review-round label: 7.
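
The summary figures are simple counts over the per-commit entries. A minimal sketch, assuming each entry is a dict with `role`, `round`, and `responds_to` keys (hypothetical names; the actual research-log.json schema is not shown on this page):

```python
from collections import Counter


def summarize(entries):
    """Compute the headline counts and the per-role table from entries."""
    roles = Counter(entry["role"] for entry in entries)
    return {
        "commit_count": len(entries),
        # Commits that name at least one target SHA.
        "response_edges": sum(1 for e in entries if e.get("responds_to")),
        # Commits that carry a formal review-round label.
        "round_entries": sum(1 for e in entries if e.get("round")),
        # most_common() sorts by count and keeps first-seen order on ties.
        "roles": roles.most_common(),
    }


# Four entries taken from the commit log below, reduced to the assumed keys.
sample = [
    {"sha": "6961dec", "role": "codex-review-round", "round": "R5", "responds_to": []},
    {"sha": "ab092fa", "role": "substrate-gap", "round": None, "responds_to": []},
    {"sha": "623f0b8", "role": "codex-review-response", "round": None, "responds_to": ["ab092fa"]},
    {"sha": "29800b4", "role": "substrate-gap", "round": None, "responds_to": []},
]
print(summarize(sample))
```

Ordering ties by first appearance is consistent with the row order of the table above, though whether the script uses `Counter.most_common()` is not confirmed by this page.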

Commit log

`bae849f` gate-kit 2026-04-10

Seed: dual-agent research instructions for AXL Rosetta v4

`d2b81b5` claude-research-impl 2026-04-10

impl: complete v4 reference implementation and test harness

`0495e1e` claude-research-impl 2026-04-10

chore: add .gitignore

`d459044` spec 2026-04-11

spec: restructure v4 as normative kernel + classified layers

`2b5aaa7` codex-review-round 2026-04-11

spec: drop lossless/lossy split, code layer is lossy IR

Round R2

`ffe73c8` claude-research-impl 2026-04-11

chore: expand gitignore to exclude system and tool directories

`516ea94` spec 2026-04-11

spec: grammar boundary rewrite, kernel ≤80, evidence schema closure

`cf35feb` spec 2026-04-11

spec: R4 hardening, error taxonomy, canonical serializer, evidence backlinks

`3bd5bef` spec 2026-04-11

spec: R5 conformance hardening

`798e04f` claude-research-impl 2026-04-11

test: R6 golden corpus and conformance harness

`5ee1c21` claude-research-impl 2026-04-11

test: R6 Implementation B passes interoperability trial

`1c889f1` claude-research-impl 2026-04-11

test: R7 adversarial edge suite, 1 spec ambiguity resolved

1 metric(s) extracted

test_count_pass: 140/140 tests

`91054db` bench 2026-04-11

bench: first real compression trial, v3 live vs v4 research

`aff2b6b` bench 2026-04-11

bench: decompression fidelity + speed math + v3 live comparison

`fa1d5b0` bench 2026-04-11

bench: topology analysis and operator gap identification

`06e33be` bench 2026-04-11

bench: investor cold-read test setup

`7226e5f` bench 2026-04-11

bench: cold-read cross-model experiment design

`4ddee5a` bench 2026-04-11

bench: Gemini Flash cold decompression scored 41.7/100

`863d467` bench 2026-04-11

bench: Qwen 3.5 35B cold decompression scored 44.0/100

`3dbebf7` bench 2026-04-11

bench: micro-bakeoff for cold fact recovery redesign

`a7c3375` bench 2026-04-11

bench: B-syntax bakeoff results - numeric bundles WORK

`099bcff` ship 2026-04-11

ship: AXL Rosetta v3.1 Data Anchoring Extension

2 metric(s) extracted

pct_transition: 61%->100%
pct_transition: 35%->76%

`8fc20c0` spec 2026-04-11

spec: tighten data-anchoring claims and provenance rule

`f176046` spec 2026-04-11

spec: v3.2 Glyph Compression Layer (draft, needs cold testing)

`f0a6bcc` bench 2026-04-11

bench: v3.2 glyph cold decompression results

1 metric(s) extracted

pct_transition: 76% to 96%

`371094c` spec 2026-04-11

spec: v4 Kernel Router blueprint

`430e923` docs 2026-04-11

docs: full v4 research document with router blueprint

`312fe7d` docs 2026-04-11

docs: add full glyph tables with CJK ideograms to research document

`6fed4dd` bench 2026-04-12

bench: production baseline measurement exposes token estimation bug

`e28cf2d` bench 2026-04-12

bench: production round-trip measurement, protocol vs rationale separated

4 metric(s) extracted

compression_ratio: 2.81x char
compression_ratio: 1.36x token
compression_ratio: 2.81x char
compression_ratio: 1.36x token

`af6345b` bench 2026-04-12

bench: self-bootstrapped v3.1 compression beats production on every axis

`74d5119` docs 2026-04-12

docs: AXL server operations contract for cc-ops-axlserver

`e029fd2` docs 2026-04-12

docs: directive for cc-ops-axlserver (terse, actionable)

`80aa753` docs 2026-04-12

docs: cc-ops-axlserver directive v2 (revised in ultrathink)

`0f65c95` claude-research-impl 2026-04-13

v4: working prototype hits all four targets

1 metric(s) extracted

compression_ratio: 2.81x token

`2ba79e1` claude-research-impl 2026-04-13

v4: add construction Rosetta module, expand fact extractor

2 metric(s) extracted

compression_ratio: 4.63x chars
compression_ratio: 2.21x tokens

`0a5cad4` codex-review-round 2026-04-13

docs: response to Codex v4 prototype challenges round 1

Round R1

`35e26d5` codex-review-round 2026-04-14

docs: response to Codex R2 counter-challenges + parser-validated AXL

Round R2

`6228281` codex-review-round 2026-04-14

v4: shared canonical form layer + envelope floor (codex r3 findings)

Round R3

2 metric(s) extracted

pct_transition: 0% -> 100%
pct_transition: 0% -> 50%

`be52755` codex-review-round 2026-04-14

v4: runtime fixes for Codex R3 findings (router gate, fidelity fields, hermetic tests)

Round R3

1 metric(s) extracted

test_count_pass: 181 passed

`099dbe6` codex-review-round 2026-04-14

v4: fix canon_date error namespace + stop stale router drift (codex r4)

Round R4

`6961dec` codex-review-round 2026-04-14

v4: tight drift detector for router constant (codex r5)

Round R5

`330f53a` substrate-gap 2026-04-14

v4: construction dollar + date emitters (Gap 1)

4 metric(s) extracted

pct_transition: 41.43% -> 50.57%
pct_transition: 0% -> 100%
pct_transition: 0% -> 100%
test_count_pass: 193/193 tests

`ab092fa` substrate-gap 2026-04-14

v4: drop construction dim cap + canonical short-form recognizer (Gap 2)

5 metric(s) extracted

pct_transition: 50.57% -> 76.00%
pct_transition: 52.66% -> 100.00%
numeric_transition: 65.0 -> 75.0
numeric_transition: 50.57 -> 76.00
test_count_pass: 193/193 tests

`623f0b8` codex-review-response 2026-04-15

v4: restore negative-path gate test + refresh router doc (Codex Gap 2 review)

Responds to: ab092fa

1 metric(s) extracted

test_count_pass: 194/194 tests

`29800b4` substrate-gap 2026-04-15

v4: artifact-driven routing (Gap 3)

1 metric(s) extracted

test_count_pass: 200/200 tests

`9c3247e` gate-kit 2026-04-15

v4: cold-read decision-gate kit (v3.1 vs v4 handoff)

`205a68f` corpus-result 2026-04-15

v4: cold-read decision gate RESULT — v4 wins on clean models

8 metric(s) extracted

numeric_transition: 20.29->34.06
numeric_transition: 35.51->71.74
numeric_transition: 20.25->32.91
numeric_transition: 31.65->53.16
numeric_transition: 11.39->30.38
numeric_transition: 17.09->54.43
numeric_transition: 40.00->53.33
numeric_transition: 26.67->73.33

`5dcdabc` codex-review-response 2026-04-15

v4: cold-read gate amendment — fix Gemini concat, add precision (Codex review)

Responds to: 205a68f

1 metric(s) extracted

pct_transition: 32.01% -> 23.08%

`4a5559b` substrate-gap 2026-04-15

v4: corpus #2 cold-read kit (construction) + scorer structural guards

`99c584b` gate-kit 2026-04-15

v4: corpus #2 — longer cold-read prompt + Grok/DeepSeek seeds

`3987aa3` corpus-result 2026-04-15

v4: cold-read corpus #2 RESULT — clean sweep, v4 wins all 4 models

`d9f82bc` gate-kit 2026-04-15

v4: fix prose-fallback invariant — real compression, not passthrough

7 metric(s) extracted

test_count_pass: 201/201 tests
compression_ratio: 3.24x chars
compression_ratio: 1.46x tokens
compression_ratio: 0.96x chars
compression_ratio: 0.84x tokens
compression_ratio: 2.83x chars
compression_ratio: 1.41x tokens

`a7a9254` gate-kit 2026-04-15

v4: corpus #3 cold-read kit (prose fallback, museum narrative)

4 metric(s) extracted

compression_ratio: 3.24x chars
compression_ratio: 1.46x tokens
compression_ratio: 2.83x chars
compression_ratio: 1.41x tokens

`4184bfe` corpus-result 2026-04-16

v4: cold-read corpus #3 RESULT — mixed: recall up, precision down

`b176ad2` claude-research-impl 2026-04-16

v4: qualified reversal of fold-back conclusion (cold-read gate, 3 corpora)

2 metric(s) extracted

test_count_pass: 201 tests pass
test_count_pass: 201/201 tests

`7da8533` codex-review-response 2026-04-16

v4: enforce prose envelope invariant at runtime (Codex b176ad2 review)

Responds to: b176ad2

4 metric(s) extracted

numeric_transition: 768 -> 1127
numeric_transition: 768 -> 907
numeric_transition: 34878 -> 12319
test_count_pass: 203 tests pass

`595b743` gate-kit 2026-04-16

v4: prose precision pass — word-aware aliasing + lowercase headers

4 metric(s) extracted

numeric_transition: 203 -> 205
test_count_pass: 205 tests pass
compression_ratio: 2.77x chars
compression_ratio: 1.35x tokens

`c7704a6` codex-review-response 2026-04-19

v4: fix prose header acronym preservation + metadata provenance (Codex 595b743 review)

Responds to: 595b743

2 metric(s) extracted

numeric_transition: 205 -> 206
test_count_pass: 206 tests pass

`a6785c2` corpus-result 2026-04-20

v4: corpus #3 precision pass RESULT — 76% gap closure, still narrowly mixed

`8980042` codex-review-response 2026-04-20

v4: cold-read scorer — detect structural mimicry (Codex a6785c2 follow-up)

Responds to: a6785c2

1 metric(s) extracted

test_count_pass: 215 tests pass

`f7e3f3d` codex-review-response 2026-04-21

docs: public-facing v3.1 evidence brief + project timeline for axlprotocol.org

2 metric(s) extracted

compression_ratio: 2.90x chars
compression_ratio: 1.40x tokens

`2dcaa06` codex-review-response 2026-04-21

docs: correct axlprotocol.org brief/timeline after Codex f7e3f3d review

Responds to: f7e3f3d

2 metric(s) extracted

compression_ratio: 2.90x chars
compression_ratio: 1.40x tokens

`45cac43` docs 2026-04-21

docs: HTML fragments for axlprotocol.org Phase 2 handoff

`91dbceb` docs 2026-04-21

docs: v3.2 research brief + timeline uplift (Diego's "don't discard v3.2" note)

Verify this log yourself

The deterministic property ("same script, same HEAD, same output") is only meaningful if you can run the script. Both the script and the machine-readable output are published here:

Script source: /research-log/build-research-log.py (Python 3 stdlib only, no network, no LLM calls)
Machine-readable log: /research-log/research-log.json (the full 65-commit dataset with subject, body, role, round, response edges, metrics)

To verify on a clone of the research repository:

python3 tools/build-research-log.py --format json | diff - research-log.json
python3 tools/build-research-log.py --summary
# expected: commit_count=65, response_edges=6, round_entries=7

Any deviation from the expected summary numbers on the same HEAD is a bug; running the script against a different HEAD produces a different log, which is the intended behavior (the log is a function of history).
