The Pipeline
Every inference begins with tokens. But tokens alone are insufficient for deterministic execution. They must pass through a verification pipeline that ensures stability and reproducibility.
Token Serialization
Raw prompt text is tokenized, but tokenization alone does not guarantee determinism. The serialized representation must be canonical:
- Stable Format: Canonical JSON ensures identical byte sequences for identical logical inputs
- No Ambiguity: Key ordering is defined, whitespace is normalized
- Version Locked: Serialization format is tied to a specific schema version
Without canonical serialization, small variations in formatting produce different hashes, breaking reproducibility.
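The canonicalization step can be sketched in Python. `json.dumps` with sorted keys and compact separators approximates RFC 8785 (JCS) for simple ASCII payloads; a full JCS implementation also normalizes numbers and string escapes. The token payload shown is hypothetical.

```python
import json

def canonical_json(obj) -> bytes:
    """Serialize to a canonical byte form: sorted keys, no insignificant
    whitespace, UTF-8 output. Approximates RFC 8785 for simple payloads."""
    return json.dumps(
        obj,
        sort_keys=True,          # defined key ordering
        separators=(",", ":"),   # whitespace normalized away
        ensure_ascii=False,      # JCS emits UTF-8, not \uXXXX escapes
    ).encode("utf-8")

# Two logically identical inputs that differ only in key order
# serialize to identical byte sequences, hence identical hashes.
a = {"tokens": [101, 2054, 102], "schema": "v1"}
b = {"schema": "v1", "tokens": [101, 2054, 102]}
assert canonical_json(a) == canonical_json(b)
```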
Cryptographic Binding
The canonical input is hashed using BLAKE3:
input_digest = BLAKE3(canonical_json(tokens))
This digest becomes the cryptographic fingerprint of the input. Any change to the input—even a single character—produces a completely different hash.
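A minimal sketch of the binding step and its avalanche property. Python's standard library does not ship BLAKE3, so `hashlib.blake2b` stands in here purely for illustration; the pipeline itself specifies BLAKE3 (e.g. via the `blake3` package).

```python
import hashlib

def input_digest(canonical_input: bytes) -> str:
    # Stand-in: stdlib blake2b with a 32-byte digest.
    # The actual pipeline specifies BLAKE3.
    return hashlib.blake2b(canonical_input, digest_size=32).hexdigest()

d1 = input_digest(b'{"schema":"v1","tokens":[101,2054,102]}')
d2 = input_digest(b'{"schema":"v1","tokens":[101,2054,103]}')  # one token changed
assert d1 != d2  # a single-character change yields a completely different digest
```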
Seed Derivation
Deterministic execution requires deterministic randomness. HKDF-SHA256 derives an execution seed from the input digest:
exec_seed = HKDF-SHA256(
    input_digest,
    model_version,
    execution_context
)
The seed is deterministic: same inputs → same seed. This enables reproducible model behavior without storing random state.
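The derivation can be written directly against RFC 5869 with the standard library. Feeding `model_version` as the salt and `execution_context` as the info parameter is an assumption for illustration, not the note's specified layout; all input values below are hypothetical.

```python
import hashlib
import hmac

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-and-expand per RFC 5869, instantiated with SHA-256."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                            # expand
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Hypothetical parameter layout: digest as IKM, version as salt, context as info.
seed = hkdf_sha256(
    ikm=bytes.fromhex("ab" * 32),        # input_digest
    salt=b"model-v1.2.0",                # model_version
    info=b"execution-context:default",   # execution_context
)
# Same inputs always derive the same seed.
assert seed == hkdf_sha256(bytes.fromhex("ab" * 32),
                           b"model-v1.2.0", b"execution-context:default")
```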
Execution Flow
The pipeline follows this sequence:
- Tokenization: Prompt text → token IDs
- Canonicalization: Tokens → canonical JSON
- Hashing: Canonical input → BLAKE3 digest
- Seed Derivation: Digest → HKDF execution seed
- Inference: Seed + Model → output tokens
- Receipt Generation: Input digest + Output digest → cryptographic proof
Each step is deterministic. Replay is bit-perfect.
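The full sequence can be sketched end to end. This is a toy model: blake2b stands in for BLAKE3 (which is not in the Python stdlib), and a trivially deterministic function stands in for the seeded model forward pass. All names and values are illustrative.

```python
import hashlib
import hmac
import json

def canonical_json(obj) -> bytes:
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()

def digest(data: bytes) -> bytes:
    # Stand-in for BLAKE3: stdlib blake2b, 32-byte output.
    return hashlib.blake2b(data, digest_size=32).digest()

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, i = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([i]), hashlib.sha256).digest()
        okm += block
        i += 1
    return okm[:length]

def run_pipeline(tokens, model_version: bytes, context: bytes) -> dict:
    canonical = canonical_json({"schema": "v1", "tokens": tokens})  # canonicalize
    input_digest = digest(canonical)                                # hash
    seed = hkdf_sha256(input_digest, model_version, context)        # derive seed
    # Stub "inference": any fully deterministic function of (seed, tokens)
    # stands in for the real seeded forward pass.
    output_tokens = [(t + seed[0]) % 50000 for t in tokens]
    output_digest = digest(canonical_json(output_tokens))
    # Receipt binds the input and output digests together.
    return {
        "input_digest": input_digest.hex(),
        "output_digest": output_digest.hex(),
        "receipt": digest(input_digest + output_digest).hex(),
    }

r1 = run_pipeline([101, 2054, 102], b"model-v1", b"ctx")
r2 = run_pipeline([101, 2054, 102], b"model-v1", b"ctx")
assert r1 == r2  # replay is bit-perfect
```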
Audit Trails
Token-level trails extend the pipeline to capture routing decisions:
- Adapter Selection: Which LoRA was applied?
- MoE Gating: Which expert was activated?
- Quantization: What precision was used?
These decisions are logged and bound into the execution receipt, enabling hostile verification by third parties.
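One way to bind such a trail is to hash the canonical log of routing decisions together with the input digest, so the receipt commits to every decision. This is a hypothetical sketch, not the note's specified format; blake2b again stands in for BLAKE3.

```python
import hashlib
import json

def bind_audit_trail(input_digest_hex: str, events: list) -> str:
    """Commit to routing decisions by hashing the canonical event log
    together with the input digest (hypothetical binding layout)."""
    canonical_log = json.dumps(events, sort_keys=True,
                               separators=(",", ":")).encode()
    # blake2b stands in for BLAKE3, which is not in the stdlib.
    return hashlib.blake2b(
        bytes.fromhex(input_digest_hex) + canonical_log, digest_size=32
    ).hexdigest()

trail = [
    {"step": 0, "adapter": "lora-finance-v3"},  # adapter selection
    {"step": 0, "moe_expert": 5},               # MoE gating choice
    {"step": 0, "quantization": "int8"},        # precision used
]
commitment = bind_audit_trail("ab" * 32, trail)
# A third-party verifier replaying the run recomputes this commitment
# and rejects the receipt if any routing decision differs.
```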
The Physics Analogy
Tokens flow through the system like particles through a physics simulation:
- Input Tokens: Initial particle positions
- Canonical Serialization: Coordinate normalization
- Cryptographic Hash: State checksum
- Seed Derivation: Initial velocity calculation
- Execution: Particle trajectory under fixed laws
- Receipt: Final state verification
The laws of motion are the runtime constraints. Determinism is the conservation principle.
AdapterOS Implementation
AdapterOS enforces this pipeline at the runtime level:
- Canonical JSON serialization (RFC 8785 compliant)
- BLAKE3 hashing for cryptographic binding
- HKDF-SHA256 for seed derivation
- Fixed-point arithmetic to reduce numeric drift
- Token-level audit trails for routing accountability
A patent application has been filed and is under review; it is not an issued patent.
References
- RFC 8785: JSON Canonicalization Scheme (JCS)
- BLAKE3: https://github.com/BLAKE3-team/BLAKE3
- HKDF: RFC 5869
This is a canonical research note. For an interactive visualization, see ai.jkca.me/prompt-physics.