Implementation evidence / Source snapshot 2026-06-25

The system, as implemented.

This section documents the inspected GGUF.MiRust.com source package at the code, binary-format, runtime, browser, test, and artifact levels. It replaces architecture-by-aspiration with an explicit inventory of what exists, what was independently rechecked, and what remains absent.

Source package: GGUF.MiRust.com-20260625.zip · SHA-256 bd47fe8e91db5f5e5674ae4d77520b44248ab52117b170b345e1a0c87e636629
Workspace release: 0.1.0 / Rust 2021 · Four no-dependency crates
Inspected inventory: 141 files · 97,964,528 bytes · Tree manifest 1e017449eea9f70916ef5afd225d0ff7a4d3dc68292b41b6c7ddd81b914191e0
Website release: MiRust 1.5.0 · Documentation only; runtime remains separate

Evidence boundary

What this release can now say

Statements below are grounded in the inspected source archive. “Implemented” means source and artifacts are present. “Rechecked” means an additional local check was run against the supplied bytes. Neither status means production deployment or assistant quality.

Implemented

Browser-local Rust/WASM runtime

A 114 KiB compiled WebAssembly module exposes a raw C-style ABI for allocation, model loading, generation, diagnostics, reset, stepping, and model release.

Implemented

Custom .slm v1 model container

The parser validates a 108-byte header, tokenizer section, 64-byte tensor entries, aligned tensor payloads, dtype contracts, shapes, and a custom non-cryptographic checksum.

Implemented

Scalar transformer execution

The runtime executes embedding lookup, RMSNorm, Q/K/V projection, RoPE, causal attention, KV caching, output projection, residual flow, SwiGLU FFN, final normalization, logits, and sampling.
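
As an illustration of what "scalar" means for two of the listed stages, the following is a minimal sketch of RMSNorm and the SwiGLU gate written as plain loops. It is not taken from the inspected source: the function names, signatures, and the choice to write into a caller-provided output slice are assumptions made for this example.

```rust
/// Scalar RMSNorm over one activation vector:
/// out[i] = x[i] / sqrt(mean(x^2) + eps) * weight[i].
/// Hypothetical signature; the inspected runtime may lay this out differently.
fn rms_norm(x: &[f32], weight: &[f32], eps: f32, out: &mut [f32]) {
    assert!(!x.is_empty());
    assert_eq!(x.len(), weight.len());
    assert_eq!(x.len(), out.len());
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    for i in 0..x.len() {
        out[i] = x[i] * inv_rms * weight[i];
    }
}

/// SiLU activation: v * sigmoid(v).
fn silu(v: f32) -> f32 {
    v / (1.0 + (-v).exp())
}

/// SwiGLU gating step of the FFN: out[i] = silu(gate[i]) * up[i].
/// `gate` and `up` are the outputs of the two input projections.
fn swiglu(gate: &[f32], up: &[f32], out: &mut [f32]) {
    assert_eq!(gate.len(), up.len());
    assert_eq!(gate.len(), out.len());
    for i in 0..gate.len() {
        out[i] = silu(gate[i]) * up[i];
    }
}
```

With unit weights and `eps = 0`, the output of `rms_norm` has a mean square of 1, which is a convenient sanity check for any implementation of this stage.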

Implemented

F32, Q8_0, and Q4_0 storage

Quantized matrices retain packed values and scales, dispatch directly to quantized matrix-vector kernels, and avoid full decoded f32 shadow copies.
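
A minimal sketch of a matrix-vector kernel that reads packed 8-bit values and per-block scales directly, without first decoding the matrix to f32, is shown below. The struct `Q8Matrix`, its field names, the row-major layout, and the use of f32 scales are assumptions for illustration; the source states only that packed values and scales are retained and that the block size is recorded per tensor.

```rust
/// Hypothetical borrowed view of one q8-style matrix, row-major.
struct Q8Matrix<'a> {
    rows: usize,
    cols: usize,
    block: usize,       // values sharing one scale; must divide cols
    values: &'a [i8],   // rows * cols packed values
    scales: &'a [f32],  // rows * (cols / block) scales (f32 is an assumption)
}

/// out = M * x, accumulating each block in f32 and applying its scale once.
/// No decoded f32 copy of the matrix is ever materialized.
fn matvec_q8(m: &Q8Matrix, x: &[f32], out: &mut [f32]) {
    assert!(m.block > 0 && m.cols % m.block == 0);
    let blocks_per_row = m.cols / m.block;
    assert_eq!(x.len(), m.cols);
    assert_eq!(out.len(), m.rows);
    assert_eq!(m.values.len(), m.rows * m.cols);
    assert_eq!(m.scales.len(), m.rows * blocks_per_row);

    for r in 0..m.rows {
        let mut acc = 0.0f32;
        for b in 0..blocks_per_row {
            let start = r * m.cols + b * m.block;
            let mut block_sum = 0.0f32;
            for j in 0..m.block {
                block_sum += m.values[start + j] as f32 * x[b * m.block + j];
            }
            // One multiply by the scale per block, not per element.
            acc += block_sum * m.scales[r * blocks_per_row + b];
        }
        out[r] = acc;
    }
}
```

The memory cost stays at one byte per weight plus one scale per block, which is the property the paragraph above describes.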

Rechecked

WASM ABI recovery path

The supplied q8 artifact passed a local Node WebAssembly smoke test covering invalid UTF-8, null pointers, invalid lengths, sampling rejection, recovery generation, step-token generation, model free, and post-free rejection.

Not implemented

Modular Teleodynamic runtime

The source does not yet implement model registries, multi-model routing, adapters, structural operators, endogenous resource transitions, or a Teleodynamic slow loop. Those remain next-system work.

Actual workspace

Four zero-dependency Rust crates

01

tinyrustlm-runtime

WebAssembly and native library containing the model parser, tokenizer, tensors, quantized kernels, transformer execution, sampling, diagnostics, and eval runner.

02

tinyrustlm-slm-pack

Writes fixtures, validates .slm admission, validates provenance manifests, converts raw f32 trained sources, emits quantized variants, and enforces quality gates.

03

tinyrustlm-local-server

Loopback-only static server supporting GET and HEAD, percent decoding, traversal rejection, content types, and the app/WASM/model route surface.
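
The percent-decoding and traversal-rejection steps can be sketched as follows. This is an illustrative sketch rather than the server's actual code: the function names are hypothetical, and the specific rejection rules (leading slash required, no NUL, no backslash, no `..` segment) are assumptions about what a conservative static server would enforce.

```rust
/// Decode %XX escapes. Returns None on a truncated or non-hex escape,
/// or if the decoded bytes are not valid UTF-8.
fn percent_decode(path: &str) -> Option<String> {
    let bytes = path.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            let hi = (*bytes.get(i + 1)? as char).to_digit(16)?;
            let lo = (*bytes.get(i + 2)? as char).to_digit(16)?;
            out.push((hi * 16 + lo) as u8);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).ok()
}

/// Decode first, then validate, so an encoded "%2e%2e" cannot slip past
/// the traversal check.
fn safe_request_path(raw: &str) -> Option<String> {
    let decoded = percent_decode(raw)?;
    if !decoded.starts_with('/') || decoded.contains('\0') || decoded.contains('\\') {
        return None;
    }
    if decoded.split('/').any(|segment| segment == "..") {
        return None;
    }
    Some(decoded)
}
```

The ordering matters: validating the raw path and decoding afterwards would accept `/%2e%2e/` and then resolve it outside the served root.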

04

tinyrustlm-browser-harness

Static contract crawler and optional loopback probe checking required DOM identifiers, local-only policy, WASM call markers, model routes, manifests, and content types.

Dependency finding: The workspace lockfile contains only these four local packages. No third-party Rust crates are declared. This narrows supply-chain exposure but shifts all parser, math, server, and test-harness correctness into project-owned code.

Execution path

Prompt to token, without a framework

  1. 01

    Boot static assets

    Handwritten JavaScript fetches the local WASM module and instantiates it with no imports.

  2. 02

    Transfer the selected artifact

    The browser fetches the entire local .slm file, allocates equal-size WASM memory, copies all bytes, and calls load_model.

  3. 03

    Validate and materialize

    Rust checks magic, version, header, checksum, tokenizer, tensor directory, ranges, dtypes, required names, and exact shapes; then creates runtime-owned tensor storage.

  4. 04

    Encode and prefill

    The selected BTOK or BPE1 tokenizer adds BOS/EOS. Generation removes the terminal EOS and forwards each prompt token to build the KV cache.

  5. 05

    Decode autoregressively

    Each new token runs the full four-layer scalar transformer, samples from logits, appends to runtime state, and stops at EOS, context capacity, or the requested token count.

  6. 06

    Return bounded state

    The browser reads the result bytes and diagnostics JSON from WASM memory, then renders the response, benchmark panel, provenance sidecar, and local transcript.
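
Steps 4 and 5 above can be sketched as a single control loop. The sketch below is an assumption-laden illustration, not the runtime's code: `generate`, `StopReason`, and the `forward` closure (standing in for one full transformer step plus sampling) are hypothetical names, and token ids are modeled as plain `u32` values.

```rust
#[derive(Debug, PartialEq)]
enum StopReason {
    Eos,
    ContextFull,
    MaxTokens,
}

/// `prompt` is the tokenizer output including BOS and a terminal EOS.
/// `forward(token, position)` feeds one token and returns the next sampled token.
fn generate(
    mut prompt: Vec<u32>,
    eos: u32,
    context_len: usize,
    max_new_tokens: usize,
    mut forward: impl FnMut(u32, usize) -> u32,
) -> (Vec<u32>, StopReason) {
    // Drop the terminal EOS so the model continues the prompt instead of ending it.
    if prompt.last() == Some(&eos) {
        prompt.pop();
    }
    assert!(!prompt.is_empty(), "prompt must still hold at least BOS");

    let mut out = Vec::new();
    let mut pos = 0;
    let mut tok = eos; // overwritten by the first prefill step

    // Prefill: forward every prompt token to populate the KV cache.
    for &p in &prompt {
        if pos == context_len {
            return (out, StopReason::ContextFull);
        }
        tok = forward(p, pos);
        pos += 1;
    }

    // Decode: emit one token per step until a stop condition holds.
    while out.len() < max_new_tokens {
        if tok == eos {
            return (out, StopReason::Eos);
        }
        out.push(tok);
        if out.len() == max_new_tokens {
            break;
        }
        if pos == context_len {
            return (out, StopReason::ContextFull);
        }
        tok = forward(tok, pos);
        pos += 1;
    }
    (out, StopReason::MaxTokens)
}
```

With a toy `forward` such as `|t, _| t + 1`, the three stop conditions (EOS, context capacity, requested token count) can each be exercised without a model.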

Binary contract

.slm v1 at byte level

Selected fixed-header offsets
| Offset | Field | Type | Meaning |
|---|---|---|---|
| 0 | Magic | 4 bytes | ASCII SLM1 |
| 4 | Version | u32 LE | Must equal 1 |
| 8 | Header length | u32 LE | At least 108 bytes |
| 16 | Flags | u32 LE | Bit 0 permits tied output projection |
| 20–52 | Model dimensions | u32 LE | Vocabulary, layers, heads, KV heads, head dimension, FFN, context |
| 56 | RoPE theta | f32 LE | Rotary frequency base |
| 60 | RMS epsilon | f32 LE | Normalization epsilon |
| 64–80 | Tokenizer and directory | u64 LE | Offsets and tokenizer length |
| 88 | Tensor count | u32 LE | Number of 64-byte directory entries |
| 92 | Tensor-data offset | u64 LE | Must be 64-byte aligned |
| 100 | Checksum | u64 LE | Custom checksum with bytes 100–107 treated as zero |
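
A minimal sketch of reading the fields whose offsets are listed in the table follows. It covers only those single-offset fields; the model dimensions at 20–52 and the tokenizer/directory fields at 64–80 are left out because their individual offsets are not given here. The type and function names (`HeaderPrefix`, `HeaderError`, `parse_header_prefix`) are hypothetical, and the checksum is read but not verified.

```rust
#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    BadMagic,
    BadVersion(u32),
    BadHeaderLen(u32),
    MisalignedTensorData(u64),
}

#[derive(Debug, PartialEq)]
struct HeaderPrefix {
    header_len: u32,
    tied_output_allowed: bool, // flags bit 0
    rope_theta: f32,
    rms_eps: f32,
    tensor_count: u32,
    tensor_data_offset: u64,
    stored_checksum: u64,
}

fn u32_at(b: &[u8], off: usize) -> u32 {
    let mut a = [0u8; 4];
    a.copy_from_slice(&b[off..off + 4]);
    u32::from_le_bytes(a)
}

fn u64_at(b: &[u8], off: usize) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&b[off..off + 8]);
    u64::from_le_bytes(a)
}

fn f32_at(b: &[u8], off: usize) -> f32 {
    f32::from_bits(u32_at(b, off))
}

/// Checks magic, version, header length, and tensor-data alignment,
/// then returns the decoded little-endian fields.
fn parse_header_prefix(file: &[u8]) -> Result<HeaderPrefix, HeaderError> {
    if file.len() < 108 {
        return Err(HeaderError::TooShort);
    }
    if !file.starts_with(b"SLM1") {
        return Err(HeaderError::BadMagic);
    }
    let version = u32_at(file, 4);
    if version != 1 {
        return Err(HeaderError::BadVersion(version));
    }
    let header_len = u32_at(file, 8);
    if header_len < 108 {
        return Err(HeaderError::BadHeaderLen(header_len));
    }
    let tensor_data_offset = u64_at(file, 92);
    if tensor_data_offset % 64 != 0 {
        return Err(HeaderError::MisalignedTensorData(tensor_data_offset));
    }
    Ok(HeaderPrefix {
        header_len,
        tied_output_allowed: u32_at(file, 16) & 1 == 1,
        rope_theta: f32_at(file, 56),
        rms_eps: f32_at(file, 60),
        tensor_count: u32_at(file, 88),
        tensor_data_offset,
        stored_checksum: u64_at(file, 100),
    })
}
```

Checking the length before any field read keeps every slice index in bounds, so a truncated file yields `TooShort` rather than a panic.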

Each 64-byte tensor entry stores an FNV-1a name hash, dtype, rank, four dimensions, data offset and length, quantization-scale offset, and block size. Supported dtypes are f32 = 1, q8_0 = 2, and q4_0 = 3.
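
The dtype codes and the name hash can be sketched as below. The dtype mapping follows the codes stated above. The hash width is an assumption: the text names FNV-1a but not whether the 32-bit or 64-bit variant is used, so this sketch uses the standard 64-bit parameters. `DType`, `dtype_from_code`, and `fnv1a_64` are hypothetical names.

```rust
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum DType {
    F32,
    Q8_0,
    Q4_0,
}

/// Map the on-disk dtype code to a variant; unknown codes are rejected.
fn dtype_from_code(code: u32) -> Option<DType> {
    match code {
        1 => Some(DType::F32),
        2 => Some(DType::Q8_0),
        3 => Some(DType::Q4_0),
        _ => None,
    }
}

/// Standard 64-bit FNV-1a: XOR each byte into the state, then multiply
/// by the FNV prime. Whether .slm uses this width is an assumption.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a 64-bit offset basis
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3); // FNV 64-bit prime
    }
    hash
}
```

Storing a hash instead of the name keeps each directory entry at a fixed size; the loader can then look up required tensors by hashing the expected names and comparing.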

Artifact inventory

Eight local model files, zero trained-quality claims

Artifact records from checked-in provenance manifests
| Artifact | Parameters | Bytes | Precision | Admission |
|---|---|---|---|---|
| TinyLM-16M f32 | 17,048,064 | 68,194,944 | f32 | Runtime smoke only |
| TinyLM-16M q8 | 17,048,064 | 17,160,000 | q8_0 | Runtime smoke only |
| TinyLM-16M q4 | 17,048,064 | 10,657,728 | q4_0 | Runtime smoke only |
| Tiny fixture f32 | 4,824 | 20,352 | f32 | Runtime smoke only |
| Tiny fixture q8 | 4,824 | 8,832 | q8_0 | Runtime smoke only |
| Tiny fixture q4 | 4,824 | 6,592 | q4_0 | Runtime smoke only |
| Tiny BPE fixture | 4,856 | 20,544 | f32 | Runtime smoke only |
| Tiny tied-output fixture | 2,744 | 11,968 | f32 | Runtime smoke only |

Quality boundary: Every supplied manifest identifies its source kind as deterministic-smoke, declares no trained-quality claim, and requires replacement by a trained or evaluated model before a product-quality claim.

Critical implementation audit

Current gaps are now first-class documentation

No GGUF loader

Despite the project/domain name, the active runtime parses the custom .slm format, not GGUF. GGUF compatibility is not present in the inspected source.

Main-thread inference

The browser JavaScript calls blocking WASM exports directly. No Web Worker, WASM thread pool, SharedArrayBuffer path, or streaming token event loop is currently wired into the app.

Scalar CPU only

The executed path uses handwritten scalar Rust loops. No WebGPU, WebNN, SIMD-specific kernel, BLAS backend, or device dispatch abstraction is implemented.

Full-file transfer

The app fetches and copies the complete artifact into WASM memory. The transfer allocator rejects requests larger than 128 MiB. There is no range loading, model sharding, Cache API, IndexedDB, or OPFS model store.

No GQA/MQA execution

The file header carries separate attention-head and KV-head counts, but the forward-pass scratch buffers currently require the two counts to be equal. Grouped-query and multi-query attention configurations are therefore rejected.

Non-cryptographic container checksum

The internal checksum detects accidental byte changes but is not an authenticity mechanism. Artifact SHA-256 exists in external records; the browser does not verify it before admission.
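
To make the distinction concrete, here is a minimal sketch of a self-contained checksum that treats its own field (bytes 100–107) as zero. The actual .slm checksum algorithm is custom and is not reproduced here; 64-bit FNV-1a is used purely as a stand-in, and `placeholder_checksum`, `seal`, and `verify` are hypothetical names.

```rust
const CHECKSUM_OFFSET: usize = 100;
const CHECKSUM_LEN: usize = 8;

/// Stand-in digest (FNV-1a 64) over the whole file, reading the checksum
/// field as zeros so the stored value does not influence itself.
fn placeholder_checksum(file: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for (i, &b) in file.iter().enumerate() {
        let in_field = i >= CHECKSUM_OFFSET && i < CHECKSUM_OFFSET + CHECKSUM_LEN;
        let byte = if in_field { 0 } else { b };
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Write the checksum into its field (what a packer would do).
fn seal(file: &mut [u8]) {
    assert!(file.len() >= CHECKSUM_OFFSET + CHECKSUM_LEN);
    let sum = placeholder_checksum(file);
    file[CHECKSUM_OFFSET..CHECKSUM_OFFSET + CHECKSUM_LEN].copy_from_slice(&sum.to_le_bytes());
}

/// Compare the stored little-endian value with a fresh computation.
fn verify(file: &[u8]) -> bool {
    if file.len() < CHECKSUM_OFFSET + CHECKSUM_LEN {
        return false;
    }
    let mut stored = [0u8; CHECKSUM_LEN];
    stored.copy_from_slice(&file[CHECKSUM_OFFSET..CHECKSUM_OFFSET + CHECKSUM_LEN]);
    u64::from_le_bytes(stored) == placeholder_checksum(file)
}
```

A flipped payload byte makes `verify` fail, but anyone who modifies the file can call `seal` again and pass. That is why an unkeyed internal checksum cannot establish authenticity, and why an externally recorded SHA-256 or a signature is needed for that purpose.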

No modular model system

One model is loaded at a time. There is no skill manifest ABI, module registry, router, cascade, committee, adapter hot-swap, shared base, or dependency resolver.

No Teleodynamic control loop

There is no endogenous resource variable, candidate structural action evaluator, activation/retirement operator, no-op decision trace, phase detector, or co-evolving structural state.

Memory package drift

The implementation README and AGENTS file route agents through a .uai/ directory, but the supplied source archive contains workspace.uai and no .uai/ directory. This is recorded as a source-package handoff defect.

Operational contract

Source files are now connected to runbooks

MiRust 1.5.0 adds a second implementation documentation root that follows the actual runtime through ownership, memory transfer, model admission, generation transactions, failure recovery, packer gates, browser controls, test contracts, release operations, and the concrete extension points for GGUF, modular skills, and a Teleodynamic controller.

State

Runtime ownership

One process-global Mutex<Option<Runtime>> owns the active model, KV cache, tokenizer, scratch buffers, logits, sampling state, result bytes, and diagnostics.

Transfer

Raw buffer contract

Host code allocates, copies, calls, reads, and deallocates under a 128 MiB single-transfer ceiling. Pointer and capacity identity are part of the ABI.
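
A minimal sketch of an allocate/deallocate pair with a single-transfer ceiling is shown below. The export names `slm_alloc` and `slm_dealloc` are hypothetical, as is the choice to return a null pointer on rejection; the inspected ABI may name and signal these differently. In a real `cdylib` the functions would also carry `#[no_mangle]` so the host can find them by name.

```rust
const MAX_TRANSFER: usize = 128 * 1024 * 1024; // 128 MiB ceiling

/// Allocate `len` zeroed bytes for the host to fill.
/// Returns null for zero-length or over-ceiling requests.
pub extern "C" fn slm_alloc(len: usize) -> *mut u8 {
    if len == 0 || len > MAX_TRANSFER {
        return std::ptr::null_mut();
    }
    // A boxed slice has capacity exactly equal to its length, so the
    // (pointer, len) pair fully identifies the allocation.
    let boxed: Box<[u8]> = vec![0u8; len].into_boxed_slice();
    Box::into_raw(boxed) as *mut u8
}

/// Release a buffer obtained from `slm_alloc`.
///
/// Safety: `ptr` and `len` must be exactly the pair returned by and passed
/// to `slm_alloc`, and the buffer must not be used afterwards.
pub unsafe extern "C" fn slm_dealloc(ptr: *mut u8, len: usize) {
    if ptr.is_null() || len == 0 || len > MAX_TRANSFER {
        return;
    }
    unsafe {
        drop(Box::from_raw(std::ptr::slice_from_raw_parts_mut(ptr, len)));
    }
}
```

This illustrates why pointer and capacity identity belong to the contract: the deallocator reconstructs the allocation from the two values the host hands back, so a mismatched length is undefined behavior rather than a recoverable error.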

Recovery

Error-preserving state

Boundary failures write a stable message and diagnostics without automatically discarding an accepted model; model rejection and explicit release clear model-owned state.
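
The ownership and recovery rules above can be sketched with a process-global `Mutex<Option<Runtime>>`. Everything inside `Runtime` here is a placeholder, and the function names and error strings are hypothetical; the sketch only demonstrates the three behaviors described: a boundary failure keeps the accepted model, a rejected load clears state, and an explicit release clears state.

```rust
use std::sync::{Mutex, MutexGuard};

/// Placeholder for the real runtime state (model, KV cache, diagnostics, ...).
struct Runtime {
    model_len: usize,
    last_error: Option<&'static str>,
}

static RUNTIME: Mutex<Option<Runtime>> = Mutex::new(None);

/// Take the lock, recovering the guard even if a previous holder panicked.
fn lock() -> MutexGuard<'static, Option<Runtime>> {
    RUNTIME.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

fn load_model(bytes: &[u8]) -> Result<(), &'static str> {
    let mut slot = lock();
    if bytes.is_empty() {
        *slot = None; // model rejection clears model-owned state
        return Err("model rejected");
    }
    *slot = Some(Runtime { model_len: bytes.len(), last_error: None });
    Ok(())
}

fn generate(prompt: &str) -> Result<String, &'static str> {
    let mut slot = lock();
    let rt = slot.as_mut().ok_or("no model loaded")?;
    if prompt.is_empty() {
        // Boundary failure: record a stable message, keep the model.
        rt.last_error = Some("empty prompt");
        return Err("empty prompt");
    }
    rt.last_error = None;
    Ok(format!("{} bytes of model answered {:?}", rt.model_len, prompt))
}

fn free_model() {
    *lock() = None; // explicit release clears model-owned state
}

fn is_loaded() -> bool {
    lock().is_some()
}
```

A host can therefore retry after a bad prompt without reloading the artifact, while any call after `free_model` is rejected until a new model is admitted.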

Next gate

Worker and verified store

The next operational milestone moves blocking generation off the UI thread and admits cached artifacts only after cryptographic identity and manifest policy checks.

Next implementation program

From single deterministic runtime to modular Teleodynamic system

  1. Gate A

    Trained model and reproducible quality evidence

    Import an exact trained source, retain source checksums and tokenizer identity, convert all precision modes, execute task and safety evaluations, and publish scoped results.

  2. Gate B

    Worker-owned incremental runtime

    Move WASM execution off the UI thread, expose cancellation and token streaming, define transferable-buffer ownership, and preserve deterministic recovery semantics.

  3. Gate C

    Persistent artifact store and cryptographic admission

    Add chunked downloads, resumability, Cache API or OPFS persistence, SHA-256 verification, signed manifests, quotas, eviction, and transactional activation.

  4. Gate D

    Skill package and module registry

    Define exact base/tokenizer compatibility, artifact identity, input/output contracts, dependencies, authority, quality evidence, and per-module resource estimates.

  5. Gate E

    Modular orchestration

    Implement sparse routing first; then bounded cascades, pipelines, committees, adapters, and explicit uncertainty/fallback contracts.

  6. Gate F

    Teleodynamic slow loop

    Add a measured resource state, local candidate-action score, activate/deactivate/replace/retire/reserve/no-op operators, decision traces, phase diagnostics, ablations, and falsification tests.

Site/runtime boundary: MiRust.com publishes this implementation map and evidence model. It does not embed the supplied Rust source, WebAssembly binary, or model artifacts in the WordPress theme or plugin. The separate implementation project remains the executable authority.