Naming is programming.
A type's field names are not labels the model ignores. They are instructions the model executes.
```python
# Same type. Different computation.
churn_risk_tier: RiskTier  # assess voluntary customer departure
x7: RiskTier  # pick a union variant
```

Both declarations are structurally identical. A compiler cannot tell them apart, because it erases names before it runs. A language model tells them apart immediately, because the name is the first thing it reads.

A semantic index type is a type declaration whose natural-language tokens (the field names, the descriptions, the variant names) act as computational indices for a neural consumer. They do not mark slots. They tell the model what to compute. The gap between the reader that erases names and the reader that executes them is where the subject starts.
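The two readings can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: the `RiskTier` variants, the class names `Named` and `Opaque`, and the helper `structural_view` (standing in for any name-erasing reader) are all hypothetical.

```python
from dataclasses import dataclass, fields
from enum import Enum


class RiskTier(Enum):  # hypothetical variants, for illustration only
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class Named:
    churn_risk_tier: RiskTier


@dataclass
class Opaque:
    x7: RiskTier


def structural_view(cls):
    """What a name-erasing reader keeps: field types in order, names dropped."""
    return tuple(f.type for f in fields(cls))


def textual_view(cls):
    """What a name-reading consumer additionally sees: the names themselves."""
    return tuple(f.name for f in fields(cls))


print(structural_view(Named) == structural_view(Opaque))  # structure comparison
print(textual_view(Named), textual_view(Opaque))          # the text each reader gets
```

The structural views compare equal while the textual views do not, which is the whole distinction in miniature.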
The obvious pushback is that names were never free to change. Rename a field and your serializer breaks, your ORM stops mapping, your dispatcher misroutes. That is true, and it is exactly the distinction that makes this new.
Those consumers read a name as an opaque key. A key-reader is invariant under consistent renaming: change the name in the type and in every consumer at once, and behavior is preserved, because nothing depended on what the name meant, only on the name matching its counterparts elsewhere. That is the form of alpha-equivalence the runtime has always honored. λx.x and λy.y are the same function; rename throughout and nothing moves.
Every prior runtime reader that responds to a name conditions on its identity: whether it matches a counterpart elsewhere (a serializer key, an ORM column), or fits a fixed lexical rule (Fortran's I-N implicit typing). Identity-conditioning is what makes consistent renaming safe. Rename the counterpart too, or preserve the lexical class, and behavior is preserved, because nothing read what the name meant.
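Identity-conditioning can be shown with a toy serializer. The sketch below is hypothetical (`make_codec` is not from any library); it only assumes that writer and reader agree on one key.

```python
import json


def make_codec(key):
    """Build an encoder/decoder pair that agree on one opaque key."""
    def encode(value):
        return json.dumps({key: value})

    def decode(text):
        return json.loads(text)[key]

    return encode, decode


# Consistent renaming: writer and reader both switch to the new key.
encode_a, decode_a = make_codec("churn_risk_tier")
encode_b, decode_b = make_codec("x7")
round_trip_a = decode_a(encode_a("high"))
round_trip_b = decode_b(encode_b("high"))

# Inconsistent renaming: only the writer changed, so the reader breaks.
try:
    decode_a(encode_b("high"))
    mismatch_detected = False
except KeyError:
    mismatch_detected = True
```

Both round trips return the same value because the codec never reads what the key means; only the mismatched pair fails.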
A neural consumer is the first reader that conditions on the name's meaning rather than its identity. There is no counterpart to coordinate with and no lexical rule to satisfy; the meaning is inferred from the token alone. So no consistent rename rescues you. Renaming `churn_risk_tier` to `x7` everywhere at once still degrades the output, because the meaning was load-bearing and the meaning is gone.
Prior runtime readers condition on a name's identity and stay invariant under consistent renaming. The neural consumer is the first that conditions on a name's meaning: renames that change meaning change its behavior, while renames that preserve meaning do not. That partition is the violation of alpha-equivalence, and it is the reason a schema is now part of the computation, not just a description of its output.
That is the thesis at the level a type theorist will accept and a builder will feel. Everything below is its consequences.
If you use Pydantic, Zod, JSON Schema tool definitions, function calling, grammar-constrained decoding, or typed agent tools, you are already authoring semantic index types, named or not. Every field name you choose is steering a model right now. The only question is whether you are doing it on purpose.
So the mental model most teams carry, schema as output format, is no longer sufficient. The schema is part of the inference surface: read going in, not only checked coming out.
| What you assumed | What is actually true |
|---|---|
| The schema defines output format | The schema participates in inference |
| Consistent renaming is safe | Consistent renaming can change behavior |
| Descriptions are documentation | Descriptions are executable guidance |
| Validation catches bad outputs | Validation bounds semantic failure |
Choosing churn_risk_tier over attrition_risk_tier selects between two analytical framings, voluntary departure versus passive loss, and the model computes the one you named. Schema authorship is computational authorship.
A semantic index type has split operational semantics: one declaration, read by two consumers that disagree about what a name is.
```mermaid
flowchart LR
A["Type Schema<br/>names, descriptions, variant names, constraints"] --> B["Formal Interpreter<br/>validator / type system"]
A --> C["Neural Interpreter<br/>language model"]
B --> D["Structural Channel<br/>which outputs are valid"]
C --> E["Semantic Channel<br/>which valid outputs become likely"]
D --> F["Structured Output"]
E --> F
```
The formal interpreter erases names and reads structure: arity, types, constraints, construction invariants. It decides what outputs are valid, and governs admissibility.
The neural interpreter reads names and descriptions as task framing: which domain this is, which distinction matters, what kind of computation to perform. It biases which valid output is likely, and governs salience.
Structure defines what can be said. Semantics defines what gets said. Both run on the same text.
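A rough sketch of how the two channels combine, in the style of constrained decoding: the structural channel masks inadmissible values, and the semantic channel's preferences are renormalized over what remains. The function name and the scores are invented for illustration; no real model produced them.

```python
def constrained_distribution(scores, admissible):
    """Keep only admissible values, then renormalize the remaining mass."""
    kept = {value: p for value, p in scores.items() if value in admissible}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no admissible value received any probability mass")
    return {value: p / total for value, p in kept.items()}


# Hypothetical model preferences, including one value the type forbids.
model_scores = {"high": 0.5, "medium": 0.2, "low": 0.1, "banana": 0.2}
admissible_values = {"low", "medium", "high"}  # what the type permits

result = constrained_distribution(model_scores, admissible_values)
most_likely = max(result, key=result.get)
```

The mask decides that `"banana"` can never appear; the scores decide which of the three permitted values is most likely.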
Start with what is exactly true. Not a bound, an identity. For a field $f$ filled from a document $x$ under a name $N$:

$$H(Y_f \mid x) = I(N; Y_f \mid x) + H(Y_f \mid x, N)$$

where

- $H(Y_f \mid x)$: given this profile, before the field's name says anything, how undetermined the answer is. A profile that screams one tier (churned, two-month tenure, four support calls) pins it: near 0. A genuinely borderline profile, spread across three plausible tiers: open, around 1.5 bits. This is the room available, and the document sets it.
- $I(N; Y_f \mid x)$: how much of that room the name takes, with `churn_risk_tier` versus `retention_offer_aggressiveness` collapsing "which of these values" down to one. This is the semantic channel, measured against this document, the only place it is real.
- $H(Y_f \mid x, N)$: what remains once document and name have both spoken. Model noise: the part a better model removes, not a better schema.
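The three quantities above can be computed for a toy case. The sketch treats $N$ as a random choice between two candidate names with equal prior, and every probability in it is invented for illustration; nothing here is a measurement. The name-reading term is computed independently of the other two, so the code can check numerically that the first quantity equals the sum of the other two.

```python
from math import log2


def entropy(p):
    """Shannon entropy in bits of a probability list."""
    return -sum(q * log2(q) for q in p if q > 0)


def decompose(name_prior, cond):
    """Return (H(Y|x), I(N;Y|x), H(Y|x,N)) for one fixed document x.

    name_prior[n] is P(N=n | x); cond[n] is P(Y_f | x, N=n) as a list.
    """
    k = len(next(iter(cond.values())))
    marginal = [sum(name_prior[n] * cond[n][i] for n in cond) for i in range(k)]
    h_y = entropy(marginal)
    h_y_given_n = sum(name_prior[n] * entropy(cond[n]) for n in cond)
    mi = sum(
        name_prior[n] * cond[n][i] * log2(cond[n][i] / marginal[i])
        for n in cond
        for i in range(k)
        if name_prior[n] > 0 and cond[n][i] > 0
    )
    return h_y, mi, h_y_given_n


prior = {"churn_risk_tier": 0.5, "retention_offer_aggressiveness": 0.5}

# Borderline profile: the name moves the answer (made-up numbers).
borderline = {
    "churn_risk_tier": [0.8, 0.1, 0.1],
    "retention_offer_aggressiveness": [0.1, 0.1, 0.8],
}
# Pinned profile: the document already decides, whatever the name says.
pinned = {
    "churn_risk_tier": [1.0, 0.0, 0.0],
    "retention_offer_aggressiveness": [1.0, 0.0, 0.0],
}

h_b, mi_b, rest_b = decompose(prior, borderline)
h_p, mi_p, rest_p = decompose(prior, pinned)
```

On the pinned profile the name contributes nothing; on the borderline one it takes a positive share of the available room, and never more than the room itself.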
The whole claim is the first term. A name can only instruct where the document was silent. A name's power is not a property of the name; it is capped by the uncertainty the document leaves behind. A precise field name is worth nothing on a profile that pins its own answer, and worth its full weight on a borderline one, and the identity says why:

$$I(N; Y_f \mid x) = H(Y_f \mid x) - H(Y_f \mid x, N) \le H(Y_f \mid x)$$
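Keep a ceiling, too; security claims need one. It is the same chain with the worst case made explicit:

$$I(N; Y_f \mid x) \le H(Y_f \mid x) \le \log_2 |T_f|$$

Here $|T_f|$ is the number of values the field's type admits.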
| Type constraint | Ceiling | What it bounds |
|---|---|---|
| `bool` | 1 bit | worst-case room, any document |
| 4-variant union | 2 bits | worst-case room, any document |
| unconstrained `str` | unbounded | type leaves the room uncapped |
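The figures in the table are consistent with taking the base-2 logarithm of the number of values a type admits. The helper below is a sketch under that assumption; `ceiling_bits` and the `Segment` enum are hypothetical names, and only `bool`, enums, and the unbounded fallback are handled.

```python
from enum import Enum
from math import inf, log2


class Segment(Enum):  # hypothetical 4-variant union
    A = "a"
    B = "b"
    C = "c"
    D = "d"


def ceiling_bits(tp):
    """Worst-case room in bits that a type leaves open, for any document."""
    if tp is bool:
        return log2(2)
    if isinstance(tp, type) and issubclass(tp, Enum):
        return log2(len(tp))  # number of declared variants
    return inf  # e.g. unconstrained str: the type puts no cap on the room


bool_bits = ceiling_bits(bool)
union_bits = ceiling_bits(Segment)
str_bits = ceiling_bits(str)
```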
Three authors set the three terms. The type sets the worst-case room
One caveat, before a careful reader supplies it: this sizes one field.
The dial gives you a development loop for mixed formal and neural systems.
```mermaid
flowchart LR
A["Steer with language<br/>precise names, clear descriptions, good variant names"] --> B["Observe where the model fails"]
B --> C["Harden the failure into structure<br/>narrow the type, add validators, split wide fields"]
C --> D["Spend semantic bandwidth, buy a structural guarantee"]
```
Steer with language first, because it is cheap and often enough. Watch where the model fails. Harden each failure into structure, converting "the name asks for this" into "the type permits only this." Each step moves a failure from the semantic channel, where it is merely unlikely, to the structural channel, where it is impossible. That is the entire craft, and the bound tells you the exchange rate.
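One hardening step might look like the following: a field that started as a free string is narrowed to an enum, so an off-menu value stops being unlikely and becomes unconstructible. The function names and the `RiskTier` variants are assumptions made for the example.

```python
from enum import Enum


class RiskTier(Enum):  # hypothetical variants
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def parse_tier_loose(raw: str) -> str:
    """Before hardening: any string is admissible; only the name steers."""
    return raw


def parse_tier_hardened(raw: str) -> RiskTier:
    """After hardening: only declared variants can be constructed."""
    return RiskTier(raw.strip().lower())  # raises ValueError otherwise


loose_value = parse_tier_loose("medium-ish")   # accepted, though off-menu
hard_value = parse_tier_hardened(" High ")     # normalized to a variant

try:
    parse_tier_hardened("medium-ish")
    rejected = False
except ValueError:
    rejected = True
```

The loose parser lets the failure through for someone downstream to find; the hardened one refuses it at construction time.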
A prompt instructs. A semantic index type does three jobs at once.
| Artifact | Instructs | Constrains | Certifies |
|---|---|---|---|
| Prompt | yes | no | no |
| Schema text alone | sometimes | weakly | no |
| Semantic index type in typed construction | yes | yes | yes |
It instructs through names and descriptions, constrains through types and unions, and certifies through construction: in a typed construction system, a value that exists has already satisfied everything its type declares. This is "Parse, Don't Validate" at whole-program scope, where construction is the proof and the constructed value carries its guarantee forward. The prompt template, the validation pipeline, and the orchestration glue stop being three artifacts kept in sync. They collapse into one declaration doing three jobs, read by the two interpreters above: the formal one that checks and the neural one that understands. That is why this is a systems concept, not a prompting trick.
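A minimal sketch of one declaration doing the three jobs, using a frozen dataclass in place of whatever typed construction system a project actually uses. The class, field descriptions, and bounds are hypothetical.

```python
from dataclasses import dataclass, field, fields
from enum import Enum


class RiskTier(Enum):  # hypothetical variants
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class ChurnAssessment:
    # Instructs: the name and description are text a model would read.
    churn_risk_tier: RiskTier = field(
        metadata={"description": "Likelihood that the customer leaves voluntarily."}
    )
    confidence: float = field(
        metadata={"description": "Confidence in the assessment, from 0 to 1."}
    )

    def __post_init__(self):
        # Constrains and certifies: an instance cannot exist unless these hold.
        if not isinstance(self.churn_risk_tier, RiskTier):
            raise TypeError("churn_risk_tier must be a RiskTier")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


def schema_text(cls):
    """Render the instructing part of the declaration as plain text."""
    return "\n".join(
        f"{f.name}: {f.metadata['description']}" for f in fields(cls)
    )


ok = ChurnAssessment(RiskTier.HIGH, 0.9)
try:
    ChurnAssessment(RiskTier.HIGH, 1.7)
    constructed_invalid = True
except ValueError:
    constructed_invalid = False
```

Any code that receives a `ChurnAssessment` can rely on the checks having passed, because there is no other way to obtain one.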
If a name computes, a name is an instruction, and the moment instruction travels in a data channel you have the injection problem, the same vulnerability class as SQL injection at the schema level. A field name or description drawn from untrusted input is executable text reaching the model.
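One way to keep untrusted text from becoming schema text is to admit field names only if they are plain identifiers and appear on a fixed allowlist. The pattern, the allowlist contents, and the function name below are all assumptions for illustration; a real system would choose its own.

```python
import re

# Plain lowercase identifiers only: no spaces, punctuation, or prose.
SAFE_IDENTIFIER = re.compile(r"[a-z][a-z0-9_]{0,63}")

# Hypothetical allowlist of names the schema author has approved.
TRUSTED_FIELD_NAMES = frozenset({"churn_risk_tier", "confidence"})


def admit_field_name(candidate: str) -> str:
    """Return the name if it may reach the model as schema text, else raise."""
    if not SAFE_IDENTIFIER.fullmatch(candidate):
        raise ValueError("field name is not a plain identifier")
    if candidate not in TRUSTED_FIELD_NAMES:
        raise ValueError("field name is not on the allowlist")
    return candidate


def is_admitted(candidate: str) -> bool:
    try:
        admit_field_name(candidate)
        return True
    except ValueError:
        return False
```

The shape check alone is not enough, since a well-formed identifier can still carry an instruction; the allowlist is what ties the name to a trusted author.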
So the engineering story and the security story are one story, both about control of the semantic channel. The defenses are the familiar ones applied to schema text: provenance, sanitization, least privilege, and structural containment. The ceiling returns here as a security primitive: a narrow
Separate two claims that are easy to blur, and state each at its real strength.
The phenomenon is established. That neural consumers shift behavior under structure-preserving renaming is documented by converging evidence from three independent research communities: schema-guided dialogue, text-to-SQL, and code language models. This project does not discover that effect. It unifies three separate literatures under one abstraction, alpha-equivalence violation, which is a lower burden than discovery and a stronger position to argue from: the effect is not in question, only its framing was missing.
The in-domain measurement is predicted. Those communities studied dialogue, SQL, and code, not Pydantic structured output. So the specific magnitude in this target domain, and the behavior under the decisive misleading condition, are predicted, not yet shown. A preliminary design for measuring it lives in `experiment.md`; it has not been run, and this document treats nothing in it as a result. Nor does the claim lean on it: the effect is directly observable by anyone who renames a field in their own schema and watches the output move, so the measurement is corroboration, not foundation.
The experiment isolates the prediction across four structurally isomorphic schema variants.
| Variant | Semantic content | What it tests |
|---|---|---|
| Baseline | precise names plus descriptions | correct semantic indexing |
| Names-only | names kept, descriptions removed | identifiers versus prose |
| Vacuous | `field_1`, `OPTION_A`, generic text | semantic channel removed |
| Misleading | coherent wrong-domain naming | different computation, same structure |
The misleading condition is the decisive one. Vacuous naming only removes guidance, so it can only make the output noisier. Misleading naming, coherent but wrong-domain names over identical structure, tests whether the model computes a different function when the names point elsewhere. That makes it a directional claim — the output should not merely scatter, it should move toward the value the wrong names point at — so it demands a directional measurement, not a symmetric one. If the output stays structurally valid while the mass moves to where the names point, the schema is not formatting. It is part of the task.
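The four variants can be derived mechanically from one baseline so that structure is held fixed by construction. The sketch below is not the design in `experiment.md`; the baseline fields, the wrong-domain names, and the helper functions are invented to show the idea.

```python
# Each schema maps field name -> (type name, description). Hypothetical baseline.
BASELINE = {
    "churn_risk_tier": ("RiskTier", "Likelihood that the customer leaves voluntarily."),
    "confidence": ("float", "Confidence in the assessment, from 0 to 1."),
}

# Coherent but wrong-domain replacements (invented for illustration).
WRONG_DOMAIN = {
    "churn_risk_tier": ("credit_default_tier", "Likelihood that the borrower defaults."),
    "confidence": ("appraisal_certainty", "Certainty of the credit appraisal, from 0 to 1."),
}


def names_only(schema):
    """Keep names, drop descriptions."""
    return {name: (tp, "") for name, (tp, _) in schema.items()}


def vacuous(schema):
    """Replace names and descriptions with content-free text."""
    return {
        f"field_{i}": (tp, "A value.")
        for i, (tp, _) in enumerate(schema.values(), start=1)
    }


def misleading(schema, rename):
    """Swap in coherent names and descriptions from another domain."""
    return {rename[name][0]: (tp, rename[name][1]) for name, (tp, _) in schema.items()}


def structure(schema):
    """The name-erased view: field types in order."""
    return tuple(tp for tp, _ in schema.values())


variants = {
    "baseline": BASELINE,
    "names_only": names_only(BASELINE),
    "vacuous": vacuous(BASELINE),
    "misleading": misleading(BASELINE, WRONG_DOMAIN),
}
isomorphic = len({structure(v) for v in variants.values()}) == 1
```

Because every variant is produced by renaming alone, any difference in model output across them cannot be attributed to a structural change.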
- Paper: `semantic-index-types.md`, the formal treatment
- Experiment design: `experiment.md`
- Code: `sit/`
Source code is MIT. Written content is CC BY 4.0. See LICENSE.