Implementation operations

Implementation handbook / Source snapshot 2026-06-25

Operate the system that exists.

This handbook turns the reviewed TinyRustLM source into explicit state, ownership, resource, failure, promotion, and release contracts. It documents the current single-model scalar runtime and the exact interfaces that must change before Worker execution, persistent verified storage, true GGUF loading, modular tiny models, or Teleodynamic structural control can be claimed.

Current operational constants

Hard limits and state boundaries

Runtime owner      1 global runtime    Serialized by a mutex
Transfer ceiling   128 MiB             One raw allocation
Sampling cap       1,024 candidates    Fixed transient arrays
Execution          Scalar CPU          Main browser thread
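
The two numeric limits above can be pinned as constants and checked before any work starts. The following is a minimal sketch, not the TinyRustLM source: the names `TRANSFER_CEILING_BYTES`, `SAMPLING_CAP`, `admit_transfer`, and `clamp_candidates` are hypothetical, and clamping (rather than rejecting) an oversized candidate count is an assumption.

```rust
/// Transfer ceiling: 128 MiB, taken in one raw allocation.
pub const TRANSFER_CEILING_BYTES: usize = 128 * 1024 * 1024;

/// Sampling cap: the fixed transient arrays hold at most 1,024 candidates.
pub const SAMPLING_CAP: usize = 1024;

/// Hypothetical admission check run before the transfer allocation is made.
pub fn admit_transfer(len: usize) -> Result<usize, String> {
    if len > TRANSFER_CEILING_BYTES {
        return Err(format!(
            "artifact is {} bytes; ceiling is {} bytes",
            len, TRANSFER_CEILING_BYTES
        ));
    }
    Ok(len)
}

/// Hypothetical policy: clamp a requested candidate count into 1..=SAMPLING_CAP.
/// The real runtime may reject out-of-range requests instead of clamping.
pub fn clamp_candidates(requested: usize) -> usize {
    requested.clamp(1, SAMPLING_CAP)
}
```

Checking the length before allocating keeps an oversized artifact from ever reaching the single raw allocation.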

Model-load transaction

Parse, tensor materialization, scratch allocation, KV allocation, and logits allocation must all succeed before the model becomes ready.
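
The all-or-nothing rule can be sketched as a function that builds every buffer locally and publishes into the mutex-guarded slot only at the end. This is an illustration under stated assumptions: `ReadyModel`, `LoadError`, `alloc`, and `load` are hypothetical names, the "parser" is a stand-in length check, and the tensors are plain little-endian `f32` values rather than the real artifact layout.

```rust
use std::sync::Mutex;

/// Hypothetical ready-model record; the fields stand in for the real buffers.
pub struct ReadyModel {
    pub tensors: Vec<f32>,
    pub scratch: Vec<f32>,
    pub kv: Vec<f32>,
    pub logits: Vec<f32>,
}

/// One variant per load stage, so a failure names the stage that refused.
#[derive(Debug, PartialEq)]
pub enum LoadError { Parse, Tensors, Scratch, Kv, Logits }

/// Fallible zeroed allocation: reports `err` instead of aborting on failure.
fn alloc(len: usize, err: LoadError) -> Result<Vec<f32>, LoadError> {
    let mut v: Vec<f32> = Vec::new();
    v.try_reserve_exact(len).map_err(|_| err)?;
    v.resize(len, 0.0);
    Ok(v)
}

/// Runs every stage; the slot is written only after all of them succeed.
pub fn load(
    slot: &Mutex<Option<ReadyModel>>,
    bytes: &[u8],
    scratch_len: usize,
    kv_len: usize,
    vocab: usize,
) -> Result<(), LoadError> {
    // Stage 1: parse (stand-in: non-empty and a whole number of f32 values).
    if bytes.is_empty() || bytes.len() % 4 != 0 {
        return Err(LoadError::Parse);
    }
    // Stage 2: tensor materialization.
    let mut tensors = alloc(bytes.len() / 4, LoadError::Tensors)?;
    for (dst, chunk) in tensors.iter_mut().zip(bytes.chunks_exact(4)) {
        *dst = f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }
    // Stages 3 to 5: scratch, KV, and logits allocations.
    let scratch = alloc(scratch_len, LoadError::Scratch)?;
    let kv = alloc(kv_len, LoadError::Kv)?;
    let logits = alloc(vocab, LoadError::Logits)?;
    // Publish: the model becomes ready in a single assignment.
    *slot.lock().unwrap() = Some(ReadyModel { tensors, scratch, kv, logits });
    Ok(())
}
```

Because every early return happens before the final assignment, a failed stage drops its partial buffers and leaves the slot exactly as it was.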

Generation transaction

Every prompt clears prior generation state, then encodes, prefills, decodes, and publishes exactly one result or one stable error. An accepted model stays resident wherever it is safe to keep it.
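
A sketch of that transaction shape follows. It is not the runtime's code: `Session`, `GenError`, the numeric codes, the byte-level "tokenizer", and the echo "decoder" are all hypothetical stand-ins chosen to keep the example runnable.

```rust
/// Hypothetical stable error: a numeric code plus a fixed message.
#[derive(Debug, PartialEq)]
pub struct GenError {
    pub code: u32,
    pub message: &'static str,
}

/// Hypothetical session: `model_ready` stands in for the accepted model,
/// `context` for the per-prompt generation state.
pub struct Session {
    pub model_ready: bool,
    pub context: Vec<u32>,
}

impl Session {
    /// One prompt in, exactly one result or one error out.
    pub fn generate(
        &mut self,
        prompt: &str,
        max_new: usize,
        max_context: usize,
    ) -> Result<String, GenError> {
        // 1. Clear generation state left by the previous prompt.
        self.context.clear();
        if !self.model_ready {
            return Err(GenError { code: 1, message: "model not ready" });
        }
        // 2. Encode (stand-in: one token per UTF-8 byte).
        let tokens: Vec<u32> = prompt.bytes().map(u32::from).collect();
        if tokens.is_empty() {
            return Err(GenError { code: 2, message: "empty prompt" });
        }
        if tokens.len() + max_new > max_context {
            return Err(GenError { code: 3, message: "context overflow" });
        }
        // 3. Prefill the context with the prompt tokens.
        self.context.extend_from_slice(&tokens);
        // 4. Decode (stand-in: repeat the last token instead of sampling).
        let mut out = Vec::with_capacity(max_new);
        for _ in 0..max_new {
            let next = *self.context.last().expect("context is non-empty");
            self.context.push(next);
            out.push(next as u8);
        }
        // 5. Publish a single result.
        Ok(String::from_utf8_lossy(&out).into_owned())
    }
}
```

None of the error paths touch `model_ready`, which is how this sketch models retaining the accepted model across a failed prompt.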

Presentation separation

Conversation transcript, runtime context, and model residency are independent state planes with separate controls.

Evidence separation

Source presence, runtime smoke, task evaluation, safety review, browser verification, and production support are separate gates.
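
One way to keep the gates separate in tooling is to record each pass on its own and report only the highest gate reached through an unbroken chain. The sketch below assumes the gates are ordered as listed above; `Gate`, `GATES`, and `highest_claimable` are hypothetical names.

```rust
/// The six gates, in the order they are listed (ordering is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Gate {
    SourcePresence,
    RuntimeSmoke,
    TaskEvaluation,
    SafetyReview,
    BrowserVerification,
    ProductionSupport,
}

pub const GATES: [Gate; 6] = [
    Gate::SourcePresence,
    Gate::RuntimeSmoke,
    Gate::TaskEvaluation,
    Gate::SafetyReview,
    Gate::BrowserVerification,
    Gate::ProductionSupport,
];

/// Returns the last gate in the unbroken prefix of passed gates.
/// A later pass does not count while an earlier gate is still missing.
pub fn highest_claimable(passed: &[Gate]) -> Option<Gate> {
    let mut highest = None;
    for gate in GATES.iter().copied() {
        if passed.contains(&gate) {
            highest = Some(gate);
        } else {
            break;
        }
    }
    highest
}
```

Under this rule, a passing task evaluation without a recorded runtime smoke supports no claim beyond source presence.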

Resource formulas

Memory is derived from the actual code path

Core runtime allocations
Allocation        Formula                    TinyLM-16M-shaped value
Forward scratch   4 × (10H + 3F + C)         47,104 bytes
KV cache          2 × L × C × K × D × 4      8,388,608 bytes
Logits            V × 4                      1,040 bytes
Host transfer     Complete artifact bytes    Up to 134,217,728 bytes

These values exclude allocator overhead, JavaScript response buffers, token vectors, model storage, the temporary transfer allocation, and browser fetch buffering.
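
The three formulas can be checked mechanically. The sketch below is an illustration, not the runtime's code. The symbol readings (H hidden width, F feed-forward width, C context length, L layers, K key/value heads, D head dimension, V vocabulary size) are assumptions, as is the dimension set in `assumed_dims`: only V = 260 follows directly from the table (1,040 / 4), and the other values are one combination that reproduces the listed byte counts.

```rust
/// Model dimensions used by the formulas. Field meanings are assumed.
pub struct Dims {
    pub h: usize, // hidden width
    pub f: usize, // feed-forward width
    pub c: usize, // context length
    pub l: usize, // layers
    pub k: usize, // key/value heads
    pub d: usize, // head dimension
    pub v: usize, // vocabulary size
}

/// Forward scratch: 4 × (10H + 3F + C) bytes.
pub fn forward_scratch_bytes(m: &Dims) -> usize {
    4 * (10 * m.h + 3 * m.f + m.c)
}

/// KV cache: 2 × L × C × K × D × 4 bytes (keys and values, f32 each).
pub fn kv_cache_bytes(m: &Dims) -> usize {
    2 * m.l * m.c * m.k * m.d * 4
}

/// Logits: V × 4 bytes.
pub fn logits_bytes(m: &Dims) -> usize {
    m.v * 4
}

/// ASSUMED dimensions, chosen only because they reproduce the table values.
pub fn assumed_dims() -> Dims {
    Dims { h: 512, f: 2048, c: 512, l: 4, k: 8, d: 64, v: 260 }
}
```

With these assumed dimensions the arithmetic is 4 × (5,120 + 6,144 + 512) = 47,104 bytes of scratch, 2 × 4 × 512 × 8 × 64 × 4 = 8,388,608 bytes of KV cache, and 260 × 4 = 1,040 bytes of logits. The KV cache dominates by more than two orders of magnitude.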

Operator flow

Diagnose by stage, not by symptom

  1. Record immutable identities

     Capture source archive, runtime/WASM hash, model hash and internal checksum, manifest, browser, device, and UTC time.

  2. Classify load stage

     Separate fetch, transfer allocation, parser admission, tensor materialization, scratch/KV allocation, and provenance rendering.

  3. Classify generation stage

     Separate UTF-8 transfer, sampling configuration, tokenization, context, forward execution, sampling, and decode.

  4. Preserve diagnostics

     Capture numeric code, stable message, diagnostics JSON, prompt policy, elapsed time, and model identity before reset.

  5. Recover narrowly

     Retry valid generation when model state is intact, reset context when needed, reload only when model identity or model-owned state is suspect, and quarantine incompatible bytes.

  6. Promote through evidence gates

     Structural validation and runtime smoke precede task evaluation, safety review, browser compatibility, preview support, and stable support.
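
The "recover narrowly" step amounts to choosing the least invasive action that the evidence allows. A minimal sketch, assuming three boolean observations that an operator or tool has already established; `Observation`, `Recovery`, and `recover` are hypothetical names, not runtime APIs.

```rust
/// Recovery actions, from least to most invasive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Recovery {
    Retry,
    ResetContext,
    Reload,
    Quarantine,
}

/// Hypothetical summary of what diagnosis found.
pub struct Observation {
    /// The artifact bytes are incompatible with this runtime.
    pub bytes_incompatible: bool,
    /// Model identity or model-owned state is suspect.
    pub model_suspect: bool,
    /// Only the runtime context is suspect.
    pub context_suspect: bool,
}

/// Picks the narrowest action that still addresses the worst finding.
pub fn recover(o: &Observation) -> Recovery {
    if o.bytes_incompatible {
        Recovery::Quarantine
    } else if o.model_suspect {
        Recovery::Reload
    } else if o.context_suspect {
        Recovery::ResetContext
    } else {
        // Model state is intact: a valid generation can simply be retried.
        Recovery::Retry
    }
}
```

The checks run from the most severe finding down, so a reload is never chosen when a context reset or a plain retry would do.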

Next implementation boundary

Four concrete transitions

Worker ownership

Move WASM and model state behind typed commands, request IDs, token events, and cancellation.
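
A possible shape for that message protocol is sketched below. Nothing here exists in the current source: `Command`, `Event`, and `stream_tokens` are hypothetical, the cancellation flag is an in-process `AtomicBool` rather than a real Worker message, and the transport between the page and the Worker is left out entirely.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical typed commands sent to the Worker; each carries a request ID.
#[derive(Debug, Clone, PartialEq)]
pub enum Command {
    LoadModel { request_id: u64, bytes: Vec<u8> },
    Generate { request_id: u64, prompt: String },
    Cancel { request_id: u64 },
}

/// Hypothetical events sent back; every event echoes its request ID.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Token { request_id: u64, text: String },
    Done { request_id: u64 },
    Cancelled { request_id: u64 },
    Error { request_id: u64, code: u32, message: String },
}

/// Emits one Token event per token, checking for cancellation before each.
/// The stream always ends with exactly one terminal event.
pub fn stream_tokens(request_id: u64, tokens: &[&str], cancel: &AtomicBool) -> Vec<Event> {
    let mut events = Vec::new();
    for token in tokens {
        if cancel.load(Ordering::Relaxed) {
            events.push(Event::Cancelled { request_id });
            return events;
        }
        events.push(Event::Token { request_id, text: (*token).to_string() });
    }
    events.push(Event::Done { request_id });
    events
}
```

Tagging every event with the request ID lets the page discard late tokens from a request it has already cancelled or superseded.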

Verified local store

Chunk, hash, sign, validate, commit, quota, evict, and recover immutable artifact records.

True GGUF path

Add a separate versioned parser and architecture adapter; do not alias SLM1.

Modular Teleodynamic path

Add skill packages, handles, resident-set control, sparse routing, measured resources, structural actions, no-op traces, and falsification tests.