AXL / ROSETTA / V3.2 / RESEARCH

AXL Rosetta v3.2 - Glyph Compression Layer (Research Brief)

Status: Draft. Never shipped to production. Preliminary evidence tier.
Spec: spec/v3.2-glyph-compression.md (draft at commit f176046, 2026-04-11).
Does not replace: v3.1 remains the shipping production protocol at compress.axlprotocol.org.


What v3.2 is

AXL Rosetta v3.2 is a draft layered extension over v3.1 that replaces English labels inside packets with single-token Unicode glyphs. It is additive to v3.1: any v3.1 parser reads v3.2 output; unknown glyphs are passed through as opaque tokens.
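The substitute-and-pass-through mechanism can be sketched in a few lines. This is an illustrative sketch, not the spec implementation: the `GLYPHS` table is a hypothetical five-entry subset of the 39-symbol catalogue, and the function names are invented for the example.

```python
GLYPHS = {            # label -> glyph (hypothetical subset of the catalogue)
    "financial": "金",
    "increase": "↑",
    "decrease": "↓",
    "change": "Δ",
    "mean": "μ",
}
REVERSE = {glyph: label for label, glyph in GLYPHS.items()}

def compress_field(field: str) -> str:
    """Replace known English labels in a packet field with glyphs."""
    return " ".join(GLYPHS.get(tok, tok) for tok in field.split())

def decompress_field(field: str) -> str:
    """Expand known glyphs; unknown tokens pass through as opaque tokens,
    which is what keeps v3.2 output readable by a v3.1 parser."""
    return " ".join(REVERSE.get(tok, tok) for tok in field.split())
```

For example, `compress_field("financial increase q3")` yields `金 ↑ q3`, and a token the reverse map has never seen passes through untouched.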

The v3.2 layer is built on one measured insight (CJK ideograms encode more concepts per token than emoji or English words; Finding 2) and one measured non-issue (CJK characters in a packet do not flip cold models into Chinese or Japanese; Finding 1).

The v3.2 draft catalogues 39 such symbols (37 of which are confirmed single-token) and defines how to use them for compression without breaking cold decompression by receiving LLMs.


The glyph palette (summary)

The full catalogue is in spec/v3.2-glyph-compression.md. High-level structure:

| Category | Glyphs | Use | Token cost |
|---|---|---|---|
| CJK ideograms | (12 ideograms; see spec catalogue) | financial, person, above, below, large, small, medium, high, day, month, year, transformation | 1 each |
| Greek letters | Δ μ σ π α β γ δ ε λ ω | change, mean, deviation, ratio, primary, secondary, tertiary, marginal, error, rate, terminal | 1 each |
| Arrows and operators | ↑ ↓ → ← ⟹ ∵ ∴ ≈ ∑ ≡ | increase, decrease, transition, derivation, causation, because, therefore, approximately, sum, defined as | 1 for directional, 2 for logical |
| Currency symbols | ¥ € £ ₹ ₽ | currency-specific monetary markers | 1 each |

Glyphs compose. For example, 金↑ combines financial + increase into a single two-token unit (see Finding 2).
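Composition means a receiving model can read a composed unit glyph by glyph. A minimal sketch, with a hypothetical `CONCEPTS` map covering only glyphs that appear in this brief:

```python
# Hypothetical glyph -> concept map; the real catalogue lives in
# spec/v3.2-glyph-compression.md.
CONCEPTS = {"金": "financial", "↑": "increase", "↓": "decrease", "Δ": "change"}

def read_composed(unit: str) -> list[str]:
    """Split a composed glyph unit character by character into concepts;
    unrecognized characters are kept verbatim."""
    return [CONCEPTS.get(ch, ch) for ch in unit]
```

`read_composed("金↑")` returns `["financial", "increase"]`, the two concepts the unit encodes.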


Measured cold-read results (preliminary)

The v3.2 glyph layer was cold-tested on three non-Anthropic models: Qwen 3.5, Gemini Flash, and DeepSeek. Each model received the v3.2 compressed form of the CloudKitchen investment memo with no spec, no examples, and minimal reconstruction instructions. Results were scored against a hand-curated ground-truth fact list using the benchmarks/fidelity_score.py weighted F formula:

| Model | Dollar recovery | Entity recovery | Causal recovery | Total (weighted F) |
|---|---|---|---|---|
| Qwen 3.5 | 100% | 90% | 100% | 98% |
| Gemini Flash | 95% | 100% | 100% | 96% |
| DeepSeek | 100% | 100% | 100% | 100% |

Raw reconstructions are at benchmarks/cold_qwen_v32.md, benchmarks/cold_gemini_v32.md, benchmarks/cold_deepseek_v32.md.

Scorer context (load-bearing caveat)

These numbers were produced by the legacy weighted F scorer at benchmarks/fidelity_score.py, which uses hand-curated CloudKitchen-specific keyword lists. Later research (the 2026-04-14 decision gate) established that the legacy scorer is not portable across corpora: on non-CloudKitchen content, it flatlines and produces direction-confused deltas within noise. The primary scorer adopted for the later cold-read decision gate is measure_fidelity with an independent regex extractor applied uniformly to source and reconstruction.

The v3.2 glyph layer has not been re-scored against the primary extractor. The 96-100 percent numbers above are preliminary evidence from a single corpus on three models under a scorer that is known to be corpus-specific. A future cycle that re-scores v3.2 under the primary extractor against multiple corpora is the right way to upgrade the evidence tier.
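As a sketch of what "an independent regex extractor applied uniformly to source and reconstruction" means for the dollar-recovery column: extract the same fact set from both texts with one pattern, then score recall. The pattern and function names below are illustrative assumptions, not the actual measure_fidelity code.

```python
import re

# Illustrative pattern: dollar amounts like $2.5M, $40 million, $1,200.
DOLLAR = re.compile(r"\$\s?\d[\d,]*(?:\.\d+)?\s?(?:[KMB]|million|billion)?")

def extract_dollars(text: str) -> set[str]:
    """Pull dollar amounts, lightly normalized, from source or reconstruction.
    Applying the same extractor to both sides avoids corpus-specific keyword lists."""
    return {m.replace(" ", "").replace(",", "") for m in DOLLAR.findall(text)}

def recovery(source: str, reconstruction: str) -> float:
    """Fraction of source dollar facts present in the reconstruction (recall)."""
    src = extract_dollars(source)
    if not src:
        return 1.0
    return len(src & extract_dollars(reconstruction)) / len(src)
```

The key property is uniformity: because the extractor knows nothing about CloudKitchen, it ports to any corpus, which is exactly where the legacy keyword scorer flatlines.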


Key findings (these hold independent of the scorer)

Finding 1: CJK glyphs do not trigger language switching

The primary risk of using CJK characters in packet form was that cold models would detect the characters and switch to Chinese or Japanese language mode, interpreting the entire packet as CJK text. This did not happen on any tested model.

All three models stayed in English because the packet structure (pipes, English operation codes, English labels) provided sufficient English-language anchoring. This is a measured non-issue, not a theoretical one.

Finding 2: Emoji are token poison, CJK ideograms are gold

Direct comparison encoding the same concepts in cl100k_base:

| What you type | Tokens | Concepts encoded |
|---|---|---|
| 📈 | 6 | 1 (increase) |
| 金↑ | 2 | 2 (financial + increase) |
| "revenue grew" | 2 | 2 (financial + increase) |

Emoji encode less information at a higher token cost (six tokens for one concept); CJK glyphs encode as much as plain English here, and more per token as single-token glyphs replace multi-token labels. This is the unlock that motivated v3.2.
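Making the table's arithmetic explicit: tokens-per-concept is the quantity the glyph layer optimizes. The figures below are the measured cl100k_base numbers from the table; the helper itself is illustrative only.

```python
# Measured (tokens, concepts) pairs from the cl100k_base comparison table.
MEASURED = {
    "📈": (6, 1),
    "金↑": (2, 2),
    "revenue grew": (2, 2),
}

def tokens_per_concept(form: str) -> float:
    """Cost of one encoded concept for a given surface form; lower is better."""
    tokens, concepts = MEASURED[form]
    return tokens / concepts
```

The emoji pays 6 tokens per concept against 1 for either the glyph pair or the English phrase, which is why the finding labels emoji "token poison".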


Why v3.2 did not ship

Three reasons, listed in decreasing order of load:

  1. Evidence tier is preliminary. Three models, one document, a corpus-specific scorer. The same rigor the 2026-04-14 decision gate brought to v3.1 versus v4 has not been applied to v3.2.
  2. Architecture question opened. Around the time v3.2 was drafted, the v4 Kernel Router architecture emerged. The router makes domain-specific vocabularies (including glyph-based ones) pluggable at the module level rather than baked into the protocol. The v3.2 glyph palette subsequently informed the v4 financial Rosetta module directly: the marker in v4 financial output is inherited from v3.2.
  3. CJK language-switching risk was mitigated but not eliminated. The three tested models all stayed in English, but the tested set was small. Broader-panel validation was the stated prerequisite in the v3.2 spec ("Status: Draft, needs broader cold testing before shipping").

v3.2 was not abandoned; it was absorbed. The glyph-layer thinking fed into the v4 Rosetta module architecture rather than shipping as an independent v3-line extension.


Current status

Draft; never shipped. The glyph-layer work was absorbed into the v4 Rosetta module architecture (see "Why v3.2 did not ship"), and v3.1 remains the shipping production protocol.

Evidence links

spec/v3.2-glyph-compression.md (draft at commit f176046)
benchmarks/cold_qwen_v32.md, benchmarks/cold_gemini_v32.md, benchmarks/cold_deepseek_v32.md (raw cold-read reconstructions)
benchmarks/fidelity_score.py (legacy weighted F scorer used for the preliminary numbers)