AXL / LAWS

The Laws of Language Creation

A Working Reference for Designed Languages, Wire Protocols, and the Semantic Layer Between Autonomous Agents

Contents

  1. Information-Theoretic Laws
  2. Compositional Laws
  3. Constructed Language Laws
  4. Protocol Design Laws
  5. Machine-Native Language Laws
  6. The Layer Missing Between A2A and MCP
  7. Application in Practice
  8. Open Questions

Preface

This document collects the empirical, mathematical, and historical laws that govern whether a designed language can be created, adopted, and survive. It is written for the engineers and researchers building the semantic layer between autonomous agents, a layer that does not yet exist as a public standard.

The laws assembled here come from five fields: information theory, formal semantics, the history of constructed human languages, network protocol design, and the small but growing literature on machine-native communication. Each law is named, attributed, stated in plain language, and mapped to its consequence for a language being designed today.

The laws are not opinions. They have been validated repeatedly, often against the wishes of language designers who hoped to escape them. A language built in ignorance of these laws does not fail because its designer was insufficiently clever. It fails because the laws describe what languages do.

Reference posture

This is the reference AXL Protocol builds against. It is published openly so that any other group attempting to fill the same gap, or contest our fill, can be measured against the same constraints.

Part I

The Information-Theoretic Laws

These laws govern whether a language can be efficient at all. They are mathematical and they hold across substrates. A language designer can choose to violate them only by accepting their consequences.

Law 1

Zipf's Law of Abbreviation, also known as the Principle of Least Effort

George Kingsley Zipf, Human Behavior and the Principle of Least Effort (1949). Verified experimentally by Kanwal, Smith, Culbertson, & Kirby (2017).

The more frequently a word is used, the shorter it tends to be. This is a universal structural property of human languages; it has also been documented in animal communication systems and observed in computer programming languages.

The mechanism is communicative pressure. Frequent forms experience selection toward brevity because shorter is cheaper to produce. Infrequent forms experience selection toward distinctiveness because they have less context to disambiguate them. The equilibrium between these two pressures produces the observed length-frequency distribution.

Kanwal et al. (2017) showed experimentally that language users optimize form-meaning mappings only when both pressures, accuracy and efficiency, operate during a communicative task. This confirms Zipf's conjecture that the principle is not a coincidence but an emergent optimization.

For a designed language: assign the shortest forms to the most frequent meanings, or accept that the language will be inefficient at scale. Compression only materializes once repetition activates the efficiency pressure. A language used for single-message exchanges will not compress. A language used for corpus-scale exchanges will. The break-even point is determined by the fixed overhead of the language relative to the variable savings per repetition.
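As a rough sketch of that break-even point, assuming a fixed kernel overhead of O tokens and an average saving of s tokens each time a frequent form replaces its longer equivalent:

$$n^{*} = \frac{O}{s}, \qquad \text{net saving after } n \text{ uses} = n\,s - O.$$

Below n* repetitions the overhead dominates; above it the savings compound with every further use.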

Law 2

The Shannon Bound

Claude Shannon, A Mathematical Theory of Communication (1948).

No lossless compression scheme can produce output with average length less than the entropy of the source distribution. Above the entropy floor, more efficient codes exist; below it, information is destroyed.
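Stated in standard notation (a restatement of the theorem, not a new claim): for a source X with distribution p(x), any lossless code has expected length per symbol of at least the entropy,

$$\bar{L} \;\geq\; H(X) \;=\; -\sum_{x} p(x)\,\log_2 p(x) \quad \text{bits per symbol.}$$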

For a designed language: any compression claim that crosses the Shannon bound is either lossy (some information has been discarded) or wrong. A language that compresses meaning while preserving structure must declare its loss contract: which fields are preserved with bit-for-bit fidelity, which are preserved semantically but may vary lexically on round-trip, and which are discarded. The honest framing is semantic lossless, prose lossy: the structured meaning round-trips, the surface language does not. Languages that fail to declare a loss contract eventually produce confusion when round-trip behavior diverges from user expectation. Languages that declare it cleanly produce predictable behavior under composition.
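A minimal sketch of what a declared loss contract might look like, assuming the three fidelity classes named above; the field names are invented for this example and are not drawn from any AXL specification.

```python
# Hypothetical loss-contract declaration; field names invented for illustration only.
LOSS_CONTRACT = {
    "bit_exact":     ["numeric_values", "entity_identifiers", "version_header"],
    "semantic_only": ["claims", "evidence_links"],   # meaning round-trips, wording may vary
    "discarded":     ["surface_phrasing", "stylistic_markers"],
}
```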

Law 3

The Menzerath-Altmann Law

Formalized by Gabriel Altmann (1980), building on observations by Paul Menzerath.

The longer a linguistic construct, the shorter its constituent parts tend to be. Long sentences contain shorter words; long words contain shorter syllables; long compounds contain shorter morphemes. The law has been verified across human languages, in DNA codon distributions, and in software systems.
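One common formalization credited to Altmann, where x is the size of the construct measured in constituents, y the mean size of those constituents, and a, b, c are constants fitted per corpus:

$$y(x) = a\,x^{-b}\,e^{-c x}$$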

For a designed language: compression efficiency improves with input length. Headers and operators that look expensive on short inputs become marginal on long inputs because the constituents inside them get tighter as the whole gets larger. A protocol designer can either fight this by trying to make short messages compress (usually a losing battle against fixed overhead) or accept it and document the corpus-scale operating regime.

Law 4

Huffman's Theorem

David Huffman (1952). Proceedings of the IRE.

For any source with known symbol frequencies, an optimal prefix-free variable-length code can be constructed by greedy bottom-up tree construction. No prefix-free code achieves shorter average length on the same source.

For a designed language: if you know the distribution of meanings you intend to express, you can compute an optimal coding for them. If you do not know the distribution, you can still apply Huffman's principle by reserving short forms for meanings you observe to be frequent and long forms for meanings you observe to be rare, then iterating as the corpus grows.
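A minimal sketch of Huffman's greedy bottom-up construction in Python. The symbol names reuse the seven operation tags mentioned later in this document purely as toy data, and the counts are invented for illustration; this is not the coding used by any AXL release.

```python
import heapq

def huffman_code(frequencies: dict[str, int]) -> dict[str, str]:
    """Greedy bottom-up construction of an optimal prefix-free code."""
    # Heap entries: (subtree weight, unique tie-breaker, {symbol: codeword-so-far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                     # degenerate single-symbol source
        return {sym: "0" for sym in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)    # the two cheapest subtrees
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

# Toy counts, invented for illustration: frequent meanings receive the shortest codewords.
counts = {"OBS": 40, "INF": 25, "CON": 15, "SEK": 10, "YLD": 6, "MRG": 3, "PRD": 1}
for sym, code in sorted(huffman_code(counts).items(), key=lambda kv: len(kv[1])):
    print(f"{sym}: {code}")
```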

Part II

The Compositional Laws

These laws govern whether a language can express new meanings from parts. They are the difference between a language and a fixed vocabulary.

Law 5

Frege's Principle of Compositionality

Attributed to Gottlob Frege (1892). Formalized by Rudolf Carnap (1947) and refined by Richard Montague (1970).

The meaning of a complex expression is determined by the meanings of its constituent expressions and the rules used to combine them.

This principle accounts for what is sometimes called the productivity of language: with a finite vocabulary and a finite set of grammatical rules, speakers produce and understand an unbounded number of novel sentences. Without compositionality, every meaning would have to be memorized as a separate atom.

Montague showed that compositionality can be formalized as the requirement of a homomorphism between the syntactic algebra of expressions and the semantic algebra of meanings. The meaning function commutes with the combining function. This is the mathematical content of the principle.
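In symbols, with μ the meaning function, f a syntactic combining rule, and F its semantic counterpart, the requirement is that meaning commutes with combination:

$$\mu\big(f(e_1, \ldots, e_n)\big) = F\big(\mu(e_1), \ldots, \mu(e_n)\big)$$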

For a designed language intended to be parsed by systems that have not been trained on every possible input, compositionality is not optional. It is the property that allows the parser to handle inputs it has never seen before. A language whose meanings cannot be derived from its parts is not learnable from a finite specification; it must be enumerated.

Law 6

Husserl's Intersubstitutability Principle

Edmund Husserl, Logical Investigations (1900-1901).

Two expressions belong to the same semantic category if and only if they can be substituted for each other in any meaningful expression without producing meaninglessness.

The principle is the cleanness test for a type system. If two expressions are nominally of the same type but produce ungrammatical or meaningless output under substitution, the type system is incoherent and will eventually break down under composition.

For a designed language: intersubstitutability provides the discipline for designing operator categories. Every operator should belong to exactly one category, and every member of a category should be freely substitutable with every other member at the syntactic level even when the resulting meaning differs. Categories that fail this test must be split. Operators that span multiple categories must be disambiguated by syntactic context.

Law 7

The Productivity Constraint

A consequence of compositionality first noted explicitly by Noam Chomsky, Syntactic Structures (1957).

Any language a finite community can speak must be generable by a finite grammar. If the grammar were infinite, no speaker could learn it; if it were not generative, novel utterances would not be possible.

For a designed language intended to be parsed by machines, the productivity constraint translates directly to a requirement: the grammar must fit in the parser's working context. For an LLM-targeted language, this means the kernel grammar must fit in a single in-context message that any model can hold while parsing. This is the engineering reason AXL Rosetta v3 fits in 75 lines and why the 376-character minimum kernel header is treated as a load-bearing constant rather than a regrettable overhead.

Part III

The Constructed Language Laws

These laws come from the empirical history of human-designed languages, mostly from the past 150 years. They are brutal and well-attested. Most of the constructed languages ever attempted are dead. The ones that survive share specific properties.

Law 8

Cultural Neutrality is a Binary, Not a Spectrum

Demonstrated by the trajectory of Volapük, Esperanto, Ido, Interlingua, and Lojban (Logical Language Group, 1987 onward).

A constructed language whose vocabulary, grammar, or conceptual primitives privilege one source culture cannot achieve global adoption beyond that culture. Esperanto, the most widely adopted constructed auxiliary language, was built from European sources; despite its successes, its adoption outside Europe and the European diaspora has remained limited.

Lojban responded by deriving its vocabulary algorithmically from the six most widely spoken languages at the time of its design: Mandarin, Hindi, English, Russian, Spanish, and Arabic. The neutrality is structural, not aspirational.

For a designed machine language, the analogous test is neutrality across model architectures. A language that parses only on one model family is a dialect of that family, not a protocol. The empirical test is cross-architecture comprehension measured against a representative sample of frontier models from independent vendors. AXL Rosetta is tested against Anthropic, OpenAI, Google, Meta, xAI, DeepSeek, Alibaba, and Mistral architectures. Comprehension across all eight is the criterion that distinguishes a candidate protocol from a candidate prompt pattern.

Law 9

Usability Beats Purity

The clearest expression: the divergent trajectories of Lojban and Esperanto.

Lojban is logically purer, syntactically cleaner, and semantically more rigorous than Esperanto. Esperanto has hundreds of thousands of speakers and a literature. Lojban has hundreds of committed users and remains primarily a research object.

The mechanism is selection pressure on speakers. A language optimized for purity at the cost of pragmatic use selects for users who care about purity, who are a small population. A language optimized for use selects for users who need to communicate, who are a large population. Over time the larger population produces more artifacts, more learners, more applications, and the gap widens.

For a designed machine language, shipping a working pragmatic version is more valuable than refining toward a perfect specification that ships later. The decision to ship AXL Rosetta v3.1 as the productized stable release while v4 work continues openly is the application of this law. The pragmatic version is the one the community can build against; the research version is the one that explores what the next pragmatic version might become.

Law 10

The Community is the Language

Volapük (Schleyer, 1879) versus Esperanto (Zamenhof, 1887; Fundamento de Esperanto, 1905).

Volapük, the first widely promoted constructed language, peaked at perhaps a million speakers in the 1880s and was effectively extinct by 1900. The cause of death was not the language; it was the community fracture that occurred when Schleyer attempted to retain personal control over every grammatical decision. When the speakers organized to revise the language and Schleyer refused, the community split, and within a decade nearly all of them had moved to Esperanto.

Esperanto survived where Volapük did not because Zamenhof, the originator, explicitly relinquished control of the language to its speakers in the Fundamento de Esperanto, establishing the Akademio de Esperanto as the governance body. The principle was that the language belonged to the community of speakers, not the inventor.

For a designed language, ownership and governance are existential. A language with no governance dies when its inventor stops working on it. A language with governance survives founder departure and can grow beyond its initial scope. The choice between proprietary control and community governance is therefore not a values choice; it is a survival choice. (See /governance/ for AXL Protocol's specific implementation.)

Law 11

The First-Mover Advantage Decays

Demonstrated by the Volapük-to-Esperanto displacement.

Volapük had an eight-year head start on Esperanto and was overtaken within a few years of Esperanto's publication. The lesson is that being first to ship a constructed language confers a temporary advantage that can be lost rapidly to a competitor with better governance, broader cultural neutrality, or stronger pragmatic usability.

For a designed machine language being released into a category that other groups are also exploring, the first-mover advantage exists but is not durable. What is durable is the discipline of construction: the reproducibility of measurements, the transparency of the design process, the openness of the governance, and the speed of correction when errors are discovered. These compound over time and are difficult for a competitor to match without restructuring their own organization.

Part IV

The Protocol Design Laws

These laws come from the history of network protocols, particularly the Internet protocol suite, the World Wide Web, and the standards bodies that govern them. They are engineering laws and they translate directly to the design of any wire protocol, including a semantic protocol.

Law 12

The Robustness Principle and Its Modern Critique

Jon Postel, RFC 760 (1980); reiterated in RFC 1122 (1989). Modern critique by Marshall Rose (2001) and by Martin Thomson & David Schinazi, The Harmful Consequences of the Robustness Principle (IETF, 2023).

Postel's original formulation: "Be conservative in what you do, be liberal in what you accept from others." The principle was foundational to the early Internet's interoperability and is widely credited for allowing diverse implementations to coexist during the network's formative period.

The modern critique argues that liberal acceptance entrenches errors. A defective implementation produces non-conforming messages; tolerant receivers accept them; the defect becomes a de facto standard; later strict receivers cannot interoperate; the protocol calcifies around the defect.

The modern guidance is fail-fast: strict emission, strict acceptance, explicit error paths, no silent tolerance of malformed input. This is more painful in the short term (early implementations break against each other) and far healthier in the long term (the protocol does not accumulate compatibility debt).

For a designed protocol intended to last, fail-fast is the correct policy. Reject malformed packets. Return clear errors. Refuse to silently normalize. The discipline is harder to bootstrap and easier to maintain.
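A minimal sketch of the fail-fast posture, assuming a hypothetical three-field packet shape (this is not the AXL grammar); the operation tags are borrowed from Part VII purely for illustration. The point is the behavior: reject, name the violation, never normalize.

```python
class PacketError(ValueError):
    """Raised on any malformed input; nothing is silently repaired."""

# Hypothetical packet shape, for illustration only: "HEADER|OPERATION|BODY".
KNOWN_OPERATIONS = {"OBS", "INF", "CON", "MRG", "SEK", "YLD", "PRD"}

def parse_strict(raw: str) -> dict:
    """Strict acceptance with explicit error paths."""
    parts = raw.split("|")
    if len(parts) != 3:
        raise PacketError(f"expected exactly 3 '|'-separated fields, got {len(parts)}")
    header, operation, body = parts
    if not header:
        raise PacketError("empty header field")
    if operation not in KNOWN_OPERATIONS:
        raise PacketError(f"unknown operation {operation!r}; no fallback is attempted")
    if not body.strip():
        raise PacketError("empty body field")
    return {"header": header, "operation": operation, "body": body}
```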

Law 13

The End-to-End Principle

Jerome Saltzer, David Reed, & David Clark, End-to-End Arguments in System Design, ACM TOCS (1984).

Functions that can be implemented correctly only at the endpoints should be implemented at the endpoints, not in intermediate nodes. The network should be dumb; the hosts should be smart.

The principle drove the Internet's design and is the structural reason the Internet is generative: anyone can build a new application at the edge without coordinating with the network operators. A network that performs application-layer functions in the middle (deep packet inspection, content modification, application gateways) loses generativity proportionally.

For a designed semantic protocol, the end-to-end principle means compression and decompression are endpoint functions. The transport carries opaque packets. Brokers, queues, and pub-sub buses move bytes without inspecting or transforming them. This permits unilateral adoption: any pair of endpoints can begin speaking the protocol without coordination from the infrastructure between them.
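A minimal sketch of that division of labor under stated assumptions: compress and decompress are placeholder endpoint functions standing in for whatever semantic compressor the two endpoints agree on, and broker_forward stands in for any queue or bus that moves bytes without reading them.

```python
from typing import Callable

def compress(prose: str) -> bytes:        # runs only at the sending endpoint
    return prose.encode("utf-8")          # placeholder for a real semantic compressor

def decompress(packet: bytes) -> str:     # runs only at the receiving endpoint
    return packet.decode("utf-8")         # placeholder for the matching decompressor

def broker_forward(packet: bytes, deliver: Callable[[bytes], None]) -> None:
    """The middle of the network: moves opaque bytes, never parses or rewrites them."""
    deliver(packet)

# Two endpoints adopt the protocol unilaterally; the broker needs no change at all.
broker_forward(compress("observed value rose to 4.2"), lambda b: print(decompress(b)))
```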

Law 14

The Narrow Waist Architecture

Observed across the Internet protocol stack; formalized as a design pattern by various authors over the past decade.

The Internet succeeded because of its narrow waist: above the IP layer, any transport can be carried (TCP, UDP, QUIC, SCTP); below it, any link layer can be used (Ethernet, Wi-Fi, fiber, cellular). The single agreed-upon point of interoperability is the waist; above and below, diversity flourishes.

The same pattern is observed in HTTP at the application layer and in JSON at the data interchange layer. Each is a narrow waist that permits enormous variety on either side.

For a designed semantic protocol, the kernel grammar is the narrow waist. Above it: domain-specific dialects for finance, medicine, legal, code, scientific notation. Below it: any compressor implementation, any host language, any tokenizer. The kernel must be small enough to be agreed upon and stable enough to be relied upon. Domain extensions can proliferate without breaking the kernel.

Law 15

In-Band Versioning

A pattern observed across the protocols that survived without compatibility debt.

Every version of HTTP carries its version in the request line. TLS negotiates versions explicitly in the handshake. JSON Schema versions are declared in-band. Protocols that did not version in-band (early SMTP, early HTML) accumulated decades of compatibility workarounds.

For a designed semantic protocol, every packet must declare its version unambiguously. Consumers must reject packets with versions they do not support and emit clear error messages. There is no silent upgrade path. The cost is one or two extra characters per packet; the benefit is decades of avoided compatibility debt.
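A minimal sketch of the in-band discipline under an assumed header convention; the "AXL/<version>" prefix and the version strings are invented for the example and are not the actual wire format. The emitter declares its version inside every packet, and the receiver refuses anything it does not support.

```python
SUPPORTED_VERSIONS = {"3.0", "3.1"}        # hypothetical values for illustration

def emit(version: str, payload: str) -> str:
    # The version travels inside the packet itself, on every packet.
    return f"AXL/{version} {payload}"

def accept(packet: str) -> str:
    header, _, payload = packet.partition(" ")
    if not header.startswith("AXL/"):
        raise ValueError("missing in-band version declaration")
    version = header[len("AXL/"):]
    if version not in SUPPORTED_VERSIONS:
        # No silent upgrade or downgrade: refuse with a clear, actionable error.
        raise ValueError(
            f"unsupported version {version}; this receiver speaks {sorted(SUPPORTED_VERSIONS)}"
        )
    return payload
```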

Law 16

Reproducibility is the Standards Floor

Codified in modern standards practice. RFC 7282 (On Consensus and Humming in the IETF) and W3C Recommendation requirements for multiple independent implementations.

A specification that cannot be independently verified is not a specification, it is an assertion. Independent verification requires reproducible measurements, named tooling, committed input data, and committed output data.

For a designed semantic protocol: every numeric claim names its tokenizer and its corpus. Every compression measurement links to a commit SHA. Every cross-architecture comprehension claim is supported by a test corpus and a result file that anyone can re-run. Claims without these artifacts are not claims; they are advertising.

Part V

The Laws Specific to Machine-Native Languages

These laws are newer because the substrate is newer. They are particular to languages designed to be parsed by large language models.

Law 17

Tokenizer-Aware Design

Empirical, contemporary practice. Canonical example: the AXL methodology correction of 2026-04-22.

A language designed for LLMs must be measured in the substrate's native unit, which is tokens under a named tokenizer, not characters or bytes. The cost surface for the user is tokens. The substrate's parsing behavior is determined by tokens. Character-based metrics overstate efficiency in some regimes and understate it in others, and the divergence is not predictable from the character count alone.

The honest design discipline requires three things. First, every claim about compression efficiency names the tokenizer used (tiktoken cl100k_base, tiktoken o200k_base, the SentencePiece model for Llama, the Anthropic tokenizer when published, and so on). Claims that omit the tokenizer are not measurable. Second, character-based metrics are reported alongside token-based metrics but never substituted for them. Third, when a language's character compression exceeds its token compression, the gap is documented.
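A minimal sketch of tokenizer-named measurement using the tiktoken library; the reporting fields are illustrative, not the axl-core API, and the encoding name is simply passed through so that no number is ever reported without its tokenizer.

```python
import tiktoken

def report(label: str, source: str, compressed: str, encoding_name: str = "cl100k_base") -> dict:
    """Token savings under a named tokenizer, with character counts reported
    alongside (never substituted for) the token counts."""
    enc = tiktoken.get_encoding(encoding_name)
    src_tokens = len(enc.encode(source))
    out_tokens = len(enc.encode(compressed))
    return {
        "label": label,
        "tokenizer": encoding_name,                      # every claim names its tokenizer
        "tokens_source": src_tokens,
        "tokens_compressed": out_tokens,
        "tokens_saved_pct": 100 * (src_tokens - out_tokens) / max(src_tokens, 1),
        "chars_source": len(source),                     # reported alongside, not instead
        "chars_compressed": len(compressed),
    }
```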

The AXL Protocol methodology correction of 2026-04-22 is the canonical example of this law being learned the hard way. A tokens_saved_pct field derived from a character-count heuristic overstated real token savings by approximately 2.3x on typical short inputs. The correction was published publicly within 48 hours, and the four-point practice that resulted is now part of the protocol's permanent discipline. (See the methodology update.)

Law 18

Cross-Architecture Portability is the Protocol Test

Empirical. AXL Rosetta cold-read panel, eight independent model families.

A language that parses cleanly on one model is a prompt pattern. A language that parses cleanly on multiple independent model families is a candidate protocol. The threshold is empirical and the test is straightforward: present the language to models you have not been able to influence the training of, and measure cold-read comprehension.

Cold-read means the model has not seen the grammar before in this conversation, has not been fine-tuned on it, and is not given examples beyond the specification itself. The model fetches the specification, reads it once, and is then asked to parse and answer factual questions about novel inputs.

For a designed semantic protocol intended to span vendors, cross-architecture comprehension is the only measurement that distinguishes protocol from prompt. AXL Rosetta tests across at least seven independent architectures (Anthropic Claude, OpenAI GPT, Google Gemini, Meta Llama, xAI Grok, DeepSeek, Alibaba Qwen, with Mistral added in extended runs) at 95.8% mean comprehension on the v3 evaluation corpus.

Law 19

Parser Cost Determines Adoption Floor

An engineering observation derived from the economics of inference cost.

A protocol whose parser requires a separate inference call to interpret is not adopted because every message doubles inference cost. A protocol whose parser is deterministic and runs in CPU is adopted because it adds zero inference overhead at the parsing step.

For a designed semantic protocol, the reference parser must be deterministic, fast, and free of LLM dependencies in the hot path. AXL Rosetta's axl-core is pure Python with spaCy as the only heavy dependency, runs on CPU, and produces deterministic output. The compression step adds no inference cost, which means an agent emitting AXL pays for emission once and the receiver pays for parsing zero times beyond the cost of reading the bytes.

Law 20

The Adoption Threshold is Vendor Recognition

An institutional observation derived from the history of standards bodies and pretraining cycles.

Human languages succeed when enough humans speak them. Machine-native languages succeed when enough model builders include the language in their pretraining corpora and tokenizers handle it efficiently. A vendor that natively recognizes a protocol at training time produces models that compress the protocol better at inference time, which lowers the per-token cost for every customer using it, which compounds adoption.

This is the lever that rewards openness over proprietary control. A closed protocol cannot ask vendors to include it in pretraining without commercial negotiation; an open protocol with a permissive license can. The cost-of-asking is approximately zero. The benefit, if vendors agree, is durable infrastructure-scale adoption.

For a designed semantic protocol, vendor outreach is a category of work distinct from the technical work, and the timing matters. Vendors decide on tokenizer changes and pretraining corpus additions in long planning cycles. A protocol that wants to be in the next pretraining run must have its proposal in front of the vendor before the planning cycle closes for that run. The window is real and it is not infinite.

Part VI

The Layer That Is Missing Between A2A and MCP

The remainder of this document situates the laws above into the specific architectural gap that AXL Protocol fills.

The Contemporary Stack

The autonomous agent stack as it exists in early 2026 has the following layers, all addressed:

Payment is addressed by HTTP 402 (RFC 7231, reserved since the early specifications), now activated by Coinbase's x402 implementation, Stripe's Tempo, and the broader stablecoin settlement infrastructure.

Tool calling is addressed by Anthropic's Model Context Protocol (MCP), which standardizes how an agent invokes external capabilities.

Discovery and agent-to-agent routing is addressed by Google's Agent-to-Agent Protocol (A2A), which standardizes how agents find each other and exchange routing metadata.

Identity is addressed by emerging vendor standards (Cisco, Ping, ZeroID, ERC-8004 on the Ethereum side, the W3C DID work, and various others).

Registries and marketplaces are addressed by AWS Bedrock AgentCore, Microsoft's agent infrastructure, Google's Vertex AI Agent Builder, Salesforce's AgentExchange, and Fetch.ai's registry.

Social and reputation are partially addressed by emerging projects, including Meta's Moltbook agent social network.

What is not addressed is the language inside the messages. MCP defines the envelope: how a tool call is formatted, what fields it carries, how the response is structured. A2A defines the routing: how an agent finds another agent, what metadata they exchange. Neither defines the content. The content, in every production deployment today, is English prose serialized into JSON strings, occasionally with light structuring conventions but with no shared semantic grammar.

The Cost of the Missing Layer

Every multi-agent handoff currently incurs three distinct costs that a semantic layer would eliminate.

The first cost is re-tokenization. When agent A produces output and agent B reads it, the bytes flow through both models' tokenizers. The same information is paid for twice, once at emission and once at reception. With current tokenizers and current prose density, this redundancy compounds at every hop. A pipeline of five agents pays five times for the same semantic content.

The second cost is information loss at handoff boundaries. English prose has no structured slots for confidence levels, evidence pointers, entity boundaries, or epistemic modality. When an agent says "the patient probably has stage II disease based on the CA-125 reading," the receiving agent must re-parse the English to recover the structure (probability, evidence, claim type). The re-parse is approximate, not exact, and the imprecision compounds.

The third cost is attribution collapse. When multiple agents synthesize from multiple sources, the chain of provenance is preserved only if each handoff explicitly carries it. English prose does not enforce this; structured semantic packets do. Without enforcement, citation chains decay over hops, and by the time a final answer reaches a human, the trace back to source is often unrecoverable.

These costs are paid today by every multi-agent deployment in production. They are not theoretical. They are the per-token tax that the absence of a semantic layer imposes on the entire agent economy.

The Shape of the Missing Layer

The missing layer is a grammar with the following properties: it is compositional, so a parser that has read the kernel specification can handle packets it has never seen (Law 5); the kernel is small enough to fit in a single in-context message (Law 7); it declares its loss contract, semantic lossless and prose lossy (Law 2); it is measured in tokens under named tokenizers (Law 17); it parses across independent model architectures (Law 18); its reference parser is deterministic and runs on CPU with no inference cost (Law 19); it versions in-band on every packet (Law 15); it operates entirely at the endpoints, requiring no changes to the transport between them (Law 13); and it is openly licensed and openly governed (Laws 10 and 20).

AXL Rosetta v3.1, productized and stable, fills this slot today. AXL Rosetta v4, a kernel-router with pluggable Rosetta modules, frozen at v4.0.2-r6, extends the slot with domain-specific dialects above a stable kernel waist. Both are Apache 2.0, both are reproducible, and both are governed by the AXL Protocol community with founding-steward stewardship and an RFC process for spec evolution.

Why the Layer Must Be Open

Each of the laws above bears on this question, but three are decisive.

Law 10 (the community is the language) shows that proprietary protocols at the semantic layer do not survive their inventors. Volapük is the canonical case; the dead landscape of proprietary document formats and instant-messaging dialects across the past three decades is the modern repetition.

Law 14 (narrow waist) shows that interoperability layers succeed when they are stable points of agreement that diverse implementations can build above and below. Single-vendor narrow waists exist but are unstable; the vendor's commercial interests eventually conflict with at least one major adopter, and the layer fragments. Open-governance narrow waists are stable because no single vendor's commercial pressure can fork them unilaterally.

Law 20 (vendor recognition is the adoption threshold) shows that the lever that produces durable adoption is vendor inclusion in pretraining. A closed protocol cannot pull this lever without commercial negotiation; an open protocol can ask freely and benefit asymmetrically when even one vendor agrees.

The conclusion is that the semantic layer between agents must be open, openly governed, and reproducibly measured if it is to fill the gap durably. AXL Protocol is one fill. It is not the only possible fill. It is one that respects the laws above, ships working artifacts, publishes its measurements with named tokenizers and committed SHAs, corrects its errors in public within 48 hours, and is governed under Apache 2.0 with founding-steward stewardship transitioning to a foundation as adoption warrants.

Part VII

How These Laws Are Applied in Practice

The laws above are abstract. Their application to AXL Protocol is concrete. This section maps each law to the specific design decision in AXL that responds to it. The mapping is offered as both documentation of the protocol's reasoning and as a worked example that other groups attempting similar work may compare against.

Zipf's Law of Abbreviation (Law 1) is applied through entity anchors and operation tags. Frequent proper nouns in a corpus are aliased to short forms (@ent.CK); the seven cognitive operations are sigil-prefixed (OBS, INF, CON, MRG, SEK, YLD, PRD). Compression materializes only at corpus scale; the 376-character minimum kernel header dominates short inputs and the protocol expands token count below approximately 20,000 input characters. This regime is documented openly rather than hidden.

The Shannon Bound (Law 2) is respected by declaring the loss contract explicitly. AXL is semantic-lossless and prose-lossy: the structured fields round-trip with fidelity, the surface English does not. The decompressed output preserves meaning but does not reproduce the input verbatim.

The Menzerath-Altmann Law (Law 3) explains why long documents compress harder than short ones. AXL's measurements report this dependency rather than averaging across input sizes, with separate figures at character counts of 5K, 20K, 41K, and corpus-scale.

Frege's Compositionality Principle (Law 5) is the reason AXL packets parse without training. The seven operations, the entity anchors, the numeric bundles, and the relation operators combine according to positional and operator-based rules. An LLM that has read the 75-line kernel can compose meaning from packets it has never seen before.

Husserl's Intersubstitutability Principle (Law 6) drove the operator split between evidence (<-), causal (=>), and numeric transition (->) in the v2-to-v3 transition. The earlier collapsed operator failed Husserl's test under composition; the split restored category cleanness.

Cultural Neutrality (Law 8) maps to cross-architecture portability. AXL is tested against eight independent model families to ensure it is not an Anthropic dialect, a Google dialect, or any other vendor's dialect. The test is empirical and the result is reported with the architectures named.

Usability over Purity (Law 9) is the reason v3.1 is the productized stable release while v4 work continues openly. The pragmatic version ships; the research version evolves. Users who need reliable behavior use v3.1; researchers who want to contest or extend the design work on v4.

Community Governance (Law 10) is implemented through founding-steward stewardship plus an RFC process for spec evolution, transitioning to foundation governance as adoption warrants. Apache 2.0 across the stack ensures the protocol cannot be captured commercially. Operational specifics at /governance/.

Postel's Principle and its Critique (Law 12) is resolved in favor of fail-fast. AXL parsers reject malformed packets explicitly rather than silently accepting them. Error messages identify the violation. The discipline is harder to bootstrap and easier to maintain.

The End-to-End Principle (Law 13) is the reason AXL is an endpoint protocol with no required infrastructure changes. Compression and decompression happen at the endpoints; the bridge moves opaque packets without inspection. Any pair of endpoints can adopt AXL unilaterally.

The Narrow Waist (Law 14) is the architecture of the v4 release. The kernel is the waist; pluggable Rosetta modules above it provide domain-specific dialects (finance, medicine, code, scientific notation) without breaking the kernel.

In-Band Versioning (Law 15) is implemented by the kernel header on every packet. Version is declared explicitly; receivers reject versions they do not support; there is no silent upgrade path.

Reproducibility (Law 16) is implemented through the discipline documented in the project's research log and evidence brief. Every numeric claim names its tokenizer, links to a commit SHA, and is reproducible by any reader with the package and the corpus.

Tokenizer-Aware Design (Law 17) was learned through the methodology correction of 2026-04-22 and is now permanent project policy. Every measurement names its tokenizer; character metrics are reported alongside token metrics, never substituted; corrections to numeric claims are published within 48 hours.

Cross-Architecture Portability (Law 18) is the criterion that distinguishes AXL from a prompt pattern. Eight families tested, 95.8% mean comprehension on the v3 evaluation corpus, results published openly.

Parser Cost (Law 19) is addressed by the pure-Python deterministic implementation of axl-core. No LLM in the compression hot path. Compression adds zero inference overhead.

Vendor Recognition (Law 20) is the lever the project pulls through public proposal and open-letter outreach to model vendors, requesting native parser support and inclusion in pretraining. The ask is cheap; the benefit is potentially infrastructure-scale.

Part VIII

Open Questions

The laws above are the ones we believe are settled. The following are not settled and are explicitly open questions for the community working on the semantic layer.

The optimal loss contract for cross-domain semantic protocols. Different domains (finance, medicine, code, narrative) have different tolerance for surface variation on round-trip. A single global loss contract may be too coarse; per-domain contracts may proliferate uncontrollably. The right structural answer is unknown.

The right balance between kernel stability and module proliferation. A small stable kernel with many domain modules favors interoperability at the kernel layer and innovation at the module layer. The exact placement of the boundary is a design judgment that has not yet been validated against years of community experience.

The conditions under which a vendor will commit to native parser support in pretraining. The economic argument is sound (lower per-token cost for adopters compounds), but the institutional decision-making within model vendors is opaque from outside. The path from open-letter to inclusion in a pretraining run is not yet documented.

The right relationship between semantic protocol governance and existing standards bodies. Schema.org, IETF, W3C, ISO. Each has different cycle times, different scope, different cultural fit for an emerging protocol. The optimal sequencing of submissions across these bodies is not yet established.

The threshold above which a semantic protocol should transition from founding-steward governance to foundation governance. Linux waited sixteen years; Python's governance passed to a Steering Council in 2018 when Guido van Rossum stepped back; Rust transitioned through the Rust Foundation in 2021, roughly six years after its 1.0 release. The right trigger condition for AXL is a question for the community.

These are working questions. The community pivot exists in part to enlist help in answering them.

Closing

The laws collected in this document are not a manifesto. They are the constraints inside which any semantic protocol for autonomous agents must be designed if it intends to survive contact with reality. The history of constructed languages is the history of designers learning these constraints the hard way, and most designed languages did not survive the lesson.

AXL Protocol is published as one fill of the missing semantic layer between agents. It is built with respect for the laws above, measured against tokenizer-named metrics with linked SHAs, governed under Apache 2.0 with founding-steward stewardship, and corrected publicly when its measurements are wrong. The community is invited to contest the design, propose alternatives, fork the work, or join the working groups shaping its evolution.

The grammar is open. The governance is open. The discipline is the moat.

CC-OPS-AXLPROTOCOL-BRAIN
2026-04-25
Rendered to /laws/ 2026-04-27.

This document is open to challenge, refinement, and extension through the public RFC process. Operational governance for AXL Protocol at /governance/. Apache 2.0, community-stewarded, built in the open.