This page explains the converted-trained prerequisite, the bound evaluation sidecar, exact-case results, the safety review, the quality decision, and why the supplied fixture is not product evidence.
Implementation evidence: this topic is grounded in the reviewed GGUF.MiRust.com source snapshot. It documents observed code and artifacts without claiming broad deployment, model quality, or production readiness.
Prerequisites
The model source kind must be converted-trained. The evaluation sidecar must bind the model checksum and manifest checksum, declare assistant-quality gate version 1, pass task evaluation and safety review, and record quality_decision=accepted-for-assistant-quality.
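The prerequisites above amount to a set of equality checks against the sidecar. The following is a minimal sketch of such a check, not the project's implementation. Only the values converted-trained and accepted-for-assistant-quality, the gate version 1, and the quality_decision key come from this page. The function name, every other key name, the "pass" status strings, and the idea that the sidecar is available as a flat mapping are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Mapping


def sidecar_problems(
    sidecar: Mapping[str, object],
    model_checksum: str,
    manifest_checksum: str,
) -> list[str]:
    """Return one message per unmet prerequisite; an empty list means all are met.

    Key names other than quality_decision are hypothetical, as are the
    "pass" status strings. The checksums are compared as opaque strings,
    so no hash algorithm is assumed here.
    """
    expected: dict[str, object] = {
        "model_source_kind": "converted-trained",
        "model_checksum": model_checksum,
        "manifest_checksum": manifest_checksum,
        "assistant_quality_gate_version": 1,
        "task_evaluation": "pass",
        "safety_review": "pass",
        "quality_decision": "accepted-for-assistant-quality",
    }
    problems = []
    for key, want in expected.items():
        found = sidecar.get(key)
        # Compare types as well as values so that True is not accepted as 1.
        if type(found) is not type(want) or found != want:
            problems.append(f"{key}: expected {want!r}, found {found!r}")
    return problems
```

Returning every unmet prerequisite, rather than stopping at the first, keeps the full reason for a rejection available for the record.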
Current runner
The native eval runner reads line-based cases and requires exact generated-text equality for each declared prompt and token limit.
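The exact-equality rule can be illustrated with a short sketch. This is not the native runner. The case file layout is not described on this page, so the tab-separated format below (prompt, token limit, expected text) is a hypothetical stand-in, and the names Case, parse_cases, and run_cases and the generate callable are assumptions.

```python
from __future__ import annotations

from typing import Callable, Iterable, NamedTuple


class Case(NamedTuple):
    prompt: str
    max_tokens: int
    expected: str


def parse_cases(lines: Iterable[str]) -> list[Case]:
    """Parse one case per line: prompt, token limit, expected text.

    The tab-separated layout is hypothetical, chosen only for illustration.
    Blank lines are skipped.
    """
    cases = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line:
            continue
        parts = line.split("\t")
        if len(parts) != 3:
            raise ValueError(
                f"line {number}: expected 3 tab-separated fields, found {len(parts)}"
            )
        prompt, limit, expected = parts
        cases.append(Case(prompt, int(limit), expected))
    return cases


def run_cases(
    cases: Iterable[Case],
    generate: Callable[[str, int], str],
) -> list[tuple[Case, str]]:
    """Return (case, actual_text) for every case whose output is not identical."""
    failures = []
    for case in cases:
        actual = generate(case.prompt, case.max_tokens)
        # Exact string equality: no trimming, case folding, or fuzzy matching.
        if actual != case.expected:
            failures.append((case, actual))
    return failures
```

Because the comparison is exact, a difference in whitespace or letter case counts as a failure. Keeping the actual text alongside each failing case makes the mismatch inspectable afterwards.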
Current fixture boundary
The checked-in two-case fixture validates evaluation plumbing for deterministic source data. Its own quality boundary states that it is not product assistant-quality evidence. A real gate requires task-representative datasets, safety protocols, reviewer identity, and retained raw outputs.
Scope
This starter page defines the questions, boundaries, evidence, and failure modes that should be recorded before a capability is presented as supported.
Engineering considerations
- Identify the source, version, target environment, and owner.
- Separate observed values from estimates and externally reported values.
- Record trade-offs, unsupported cases, and fallback behavior.
- Link performance statements to a compatible benchmark methodology.
Verification questions
- What exact artifact, revision, backend, and environment were reviewed?
- Which assumptions could change the result?
- Which data should be retained so another engineer can reproduce the conclusion?