This page provides a bounded operator process for handling invalid prompt, sampling, context, shape, decoding, and runtime failures while preserving useful diagnostics.
Implementation evidence: this topic is grounded in the reviewed GGUF.MiRust.com source snapshot. It documents observed code and artifacts without claiming broad deployment, model quality, or production readiness.
Capture before reset
Before resetting, save the prompt bytes (or an approved redacted representation), model identity, sampling configuration, requested max token count, result message, numeric code, diagnostics JSON, browser elapsed time, and console entry.
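The capture list can be held in one record so that nothing is lost when the state is reset. The following is a minimal sketch, not a type from the reviewed source: `CaptureRecord`, its field names, and `missing_fields` are hypothetical names chosen for illustration. Every field is an `Option` so that "captured but empty" (for example, an empty prompt) stays distinct from "not captured".

```rust
/// Hypothetical record of what to save before a reset.
/// Names are illustrative and do not come from the reviewed source.
#[derive(Debug, Clone, Default)]
pub struct CaptureRecord {
    /// Raw prompt bytes, or the approved redacted representation.
    pub prompt: Option<Vec<u8>>,
    pub model_identity: Option<String>,
    /// Sampling configuration kept as opaque text (for example JSON).
    pub sampling_config: Option<String>,
    pub max_new_tokens: Option<u32>,
    pub result_message: Option<String>,
    pub numeric_code: Option<i32>,
    pub diagnostics_json: Option<String>,
    pub browser_elapsed_ms: Option<f64>,
    pub console_entry: Option<String>,
}

impl CaptureRecord {
    /// Lists the fields that have not been captured yet.
    pub fn missing_fields(&self) -> Vec<&'static str> {
        let checks = [
            ("prompt", self.prompt.is_some()),
            ("model_identity", self.model_identity.is_some()),
            ("sampling_config", self.sampling_config.is_some()),
            ("max_new_tokens", self.max_new_tokens.is_some()),
            ("result_message", self.result_message.is_some()),
            ("numeric_code", self.numeric_code.is_some()),
            ("diagnostics_json", self.diagnostics_json.is_some()),
            ("browser_elapsed_ms", self.browser_elapsed_ms.is_some()),
            ("console_entry", self.console_entry.is_some()),
        ];
        checks
            .iter()
            .filter(|(_, present)| !*present)
            .map(|(name, _)| *name)
            .collect()
    }
}
```

An operator tool could refuse to reset while `missing_fields()` is non-empty. A freshly created `CaptureRecord::default()` reports all nine fields as missing.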
Common classes
- Invalid UTF-8: typically caused by host transfer or caller misuse.
- Invalid length: an empty prompt range, a zero new-token count, or sampling values outside the contract.
- Context exceeded: the prompt or context capacity was exceeded, or a continuation was attempted at the final slot.
- Shape/dtype/tensor errors: an incompatible artifact or a source defect.
- Generation failed: a lock failure or an uncategorized execution failure that requires source review.
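The classes above can be modeled as an enum so that triage notes use one vocabulary. This is a sketch under stated assumptions: `FailureClass`, `first_suspect`, `requires_source_review`, and `classify_request` are hypothetical names, the UTF-8 check is assumed to apply to prompt bytes, and the order of checks in the real runtime is not known from this page. Only the two classes that can be detected from the request alone are checked here.

```rust
/// Hypothetical mirror of the failure classes listed on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailureClass {
    InvalidUtf8,
    InvalidLength,
    ContextExceeded,
    ShapeDtypeTensor,
    GenerationFailed,
}

impl FailureClass {
    /// Where to look first, paraphrasing the class list.
    pub fn first_suspect(self) -> &'static str {
        match self {
            FailureClass::InvalidUtf8 => "host transfer or caller misuse",
            FailureClass::InvalidLength => {
                "empty prompt range, zero new-token count, or sampling values outside the contract"
            }
            FailureClass::ContextExceeded => {
                "prompt/context capacity or continuation at the final slot"
            }
            FailureClass::ShapeDtypeTensor => "incompatible artifact or source defect",
            FailureClass::GenerationFailed => "lock or uncategorized execution failure",
        }
    }

    /// Only "Generation failed" is described as requiring source review.
    pub fn requires_source_review(self) -> bool {
        self == FailureClass::GenerationFailed
    }
}

/// Pre-flight check for the request-level classes.
/// Assumption: the prompt arrives as raw bytes and must be valid UTF-8.
pub fn classify_request(prompt: &[u8], max_new_tokens: u32) -> Result<(), FailureClass> {
    if std::str::from_utf8(prompt).is_err() {
        return Err(FailureClass::InvalidUtf8);
    }
    if prompt.is_empty() || max_new_tokens == 0 {
        return Err(FailureClass::InvalidLength);
    }
    Ok(())
}
```

Context, shape, and generation failures depend on model and runtime state, so this sketch leaves them to be reported by the runtime rather than guessed from the request.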
Recovery
A later valid generate request starts from a cleared generation state and retains the accepted model. Use Reset for explicit context cleanup; reload the model only when its identity or model-owned state is suspect.
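The recovery guidance reduces to a small decision: prefer the least disruptive action that addresses the suspicion. The sketch below is illustrative only; `RecoveryAction`, `choose_recovery`, and both boolean parameters are hypothetical and do not correspond to a documented API.

```rust
/// Hypothetical recovery choices, ordered from least to most disruptive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecoveryAction {
    /// Send the next valid generate request; generation state starts cleared.
    RetryGenerate,
    /// Explicit context cleanup.
    Reset,
    /// Reload the model; reserved for suspect identity or model-owned state.
    ReloadModel,
}

/// Picks a recovery action from two operator judgments.
pub fn choose_recovery(model_suspect: bool, want_context_cleanup: bool) -> RecoveryAction {
    if model_suspect {
        RecoveryAction::ReloadModel
    } else if want_context_cleanup {
        RecoveryAction::Reset
    } else {
        RecoveryAction::RetryGenerate
    }
}
```

Model reload takes priority because a suspect model invalidates any conclusion drawn from a retry or a reset.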
Scope
This starter page defines the questions, boundaries, evidence, and failure modes that should be recorded before a capability is presented as supported.
Engineering considerations
- Identify the source, version, target environment, and owner.
- Separate observed values from estimates and externally reported values.
- Record trade-offs, unsupported cases, and fallback behavior.
- Link performance statements to a compatible benchmark methodology.
Verification questions
- What exact artifact, revision, backend, and environment were reviewed?
- Which assumptions could change the result?
- Which data should be retained so another engineer can reproduce the conclusion?