Test matrix and evidence

Summarizes unit, parser, quantization, generation, ABI, server, harness, browser, performance-panel, and Visual Studio checks recorded by the source package.

Experimental
Last verified: 2026-06-25 00:00 UTC

Implementation evidence: this topic is grounded in the reviewed GGUF.MiRust.com source snapshot. It documents observed code and artifacts without claiming broad deployment, model quality, or production readiness.

Source-reported coverage

The workspace memory records 92 Cargo tests, plus the following checks:

  • Strict rustdoc checks and JavaScript syntax checks.
  • SLM admission, manifest validation, and trained-source conversion in all three precision modes.
  • Fixture eval and quality gates.
  • Local server checks and mini harness checks.
  • WASM ABI smoke.
  • Headless generation, error, and UI audits.
  • Endurance, performance-panel smoke, and performance soak.

Locally rerun check

MiRust independently ran the supplied Node WASM ABI smoke against tinylm16-q8.slm. The run passed the invalid UTF-8, null pointer, invalid length, zero-token, invalid sampling, recovery generation, step-token, free-model, and post-free ModelNotLoaded checks.

Not rerun here

Cargo tests, rustdoc, packer conversion, Rust server/harness, and headless browser suites were not rerun: the packaging environment did not include the Rust toolchain, and the browser's localhost policy blocked a direct page run.
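
The split between rerun and not-rerun checks can be kept as data rather than prose. The following is a minimal sketch in Python, not part of the source package: the `Status`, `Check`, and `MATRIX` names are hypothetical, and the pairing of each blocked check with one specific reason is an assumption drawn from the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    RERUN_PASSED = "rerun-passed"  # independently rerun and passed
    RERUN_FAILED = "rerun-failed"  # independently rerun and failed
    NOT_RERUN = "not-rerun"        # could not be rerun in this environment


@dataclass(frozen=True)
class Check:
    name: str
    status: Status
    reason: str = ""  # why the check was not rerun, if applicable


# Assumption: which reason applies to which check is inferred, not stated.
NO_RUST = "packaging environment did not include the Rust toolchain"
NO_PAGE = "browser localhost policy blocked a direct page run"

MATRIX = [
    Check("Node WASM ABI smoke (tinylm16-q8.slm)", Status.RERUN_PASSED),
    Check("Cargo tests", Status.NOT_RERUN, NO_RUST),
    Check("rustdoc", Status.NOT_RERUN, NO_RUST),
    Check("packer conversion", Status.NOT_RERUN, NO_RUST),
    Check("Rust server/harness", Status.NOT_RERUN, NO_RUST),
    Check("headless browser suites", Status.NOT_RERUN, NO_PAGE),
]


def summarize(matrix):
    """Count checks per status so the split is visible at a glance."""
    counts = {status: 0 for status in Status}
    for check in matrix:
        counts[check.status] += 1
    return counts


def missing_reasons(matrix):
    """Return names of not-rerun checks that lack a recorded reason."""
    return [c.name for c in matrix
            if c.status is Status.NOT_RERUN and not c.reason]


if __name__ == "__main__":
    for status, count in summarize(MATRIX).items():
        print(f"{status.value}: {count}")
```

Requiring a reason for every not-rerun entry keeps a gap in verification from being mistaken for a pass.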

Release interpretation

Source-reported test names are evidence leads, not independent certification. A production release should publish machine-readable test logs, toolchain identity, timestamps, and artifact hashes generated by the same CI run.
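
One possible shape for such a record is sketched below in Python, using only the standard library. The function names, field names, and the `toolchain` dictionary are illustrative assumptions, not a format defined by the source package.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large model artifacts are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(artifacts, test_results, toolchain):
    """Bundle results, toolchain identity, a timestamp, and hashes in one record."""
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "host": {
            "platform": platform.platform(),
            "python": sys.version.split()[0],
        },
        # Assumed field: caller supplies captured tool versions, such as
        # the output of `rustc --version`, gathered in the same CI run.
        "toolchain": toolchain,
        "artifacts": {Path(p).name: sha256_of(p) for p in artifacts},
        "tests": test_results,
    }


def write_manifest(manifest, out_path):
    """Write the manifest as sorted, indented JSON for stable diffs."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)
```

Generating the hashes and the test results in one step is the point of the sketch: a log and an artifact hash produced by different runs cannot vouch for each other.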

Scope

This starter page defines the questions, boundaries, evidence, and failure modes that should be recorded before a capability is presented as supported.

Engineering considerations

  • Identify the source, version, target environment, and owner.
  • Separate observed values from estimates and externally reported values.
  • Record trade-offs, unsupported cases, and fallback behavior.
  • Link performance statements to a compatible benchmark methodology.
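
Two of the considerations above, separating observed from reported values and tying performance statements to a methodology, can be enforced mechanically. This is a minimal sketch under stated assumptions: the `Provenance` and `Claim` types are hypothetical, and the third example claim is a placeholder rather than a figure from the source.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provenance(Enum):
    OBSERVED = "observed"
    ESTIMATED = "estimated"
    EXTERNAL = "externally reported"


@dataclass(frozen=True)
class Claim:
    statement: str
    provenance: Provenance
    is_performance: bool = False
    methodology: Optional[str] = None  # benchmark methodology reference, if any


def problems(claim):
    """List the reasons a claim is not yet ready to publish."""
    found = []
    if claim.is_performance and not claim.methodology:
        found.append("performance statement lacks a benchmark methodology")
    return found


def render(claim):
    """Prefix the statement with its provenance so the two are never separated."""
    return f"[{claim.provenance.value}] {claim.statement}"


CLAIMS = [
    Claim("Node WASM ABI smoke passed against tinylm16-q8.slm",
          Provenance.OBSERVED),
    Claim("92 Cargo tests recorded by the source package",
          Provenance.EXTERNAL),
    # Placeholder only: no throughput figure appears in the source.
    Claim("hypothetical throughput figure", Provenance.ESTIMATED,
          is_performance=True),
]

if __name__ == "__main__":
    for claim in CLAIMS:
        print(render(claim), problems(claim))
```

Here the placeholder performance claim is flagged because it carries no methodology, while the two non-performance claims pass with their provenance labels attached.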

Verification questions

  • What exact artifact, revision, backend, and environment were reviewed?
  • Which assumptions could change the result?
  • Which data should be retained so another engineer can reproduce the conclusion?