Generation failure runbook

Provides a bounded operator process for invalid prompt, sampling, context, shape, decoding, and runtime failures while preserving useful diagnostics.

Experimental

Last verified: 2026-06-25 00:00 UTC
Updated: 2026-06-25

Implementation evidence: this topic is grounded in the reviewed GGUF.MiRust.com source snapshot. It documents observed code and artifacts without claiming broad deployment, model quality, or production readiness.

Capture before reset

Before resetting, save the prompt bytes (or an approved redacted representation), model identity, sampling configuration, max token request, result message, numeric code, diagnostics JSON, browser elapsed time, and console entry.
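The capture list above can be kept as one structured record so nothing is lost when the state is cleared. The following is a minimal sketch, not the project's own format: the `FailureCapture` type, the `capture` helper, and every field name are hypothetical. Prompt bytes are base64-encoded because a failing prompt may not be valid UTF-8.

```python
import base64
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class FailureCapture:
    """One failure record. All field names are illustrative assumptions."""
    prompt_b64: Optional[str]       # raw prompt bytes, base64-encoded
    prompt_redacted: Optional[str]  # approved redacted representation
    model_identity: str
    sampling: dict
    max_new_tokens: int
    result_message: str
    numeric_code: int
    diagnostics: dict               # parsed diagnostics JSON
    browser_elapsed_ms: float
    console_entry: str


def capture(prompt: bytes, *, redacted: Optional[str] = None, **fields) -> str:
    """Serialize a failure record to JSON before any reset.

    If a redacted representation is supplied, the raw bytes are not stored.
    """
    record = FailureCapture(
        prompt_b64=None if redacted is not None
        else base64.b64encode(prompt).decode("ascii"),
        prompt_redacted=redacted,
        **fields,
    )
    return json.dumps(asdict(record), sort_keys=True)
```

Storing exactly one of the raw or redacted prompt keeps the record unambiguous about what a later reviewer is allowed to see.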

Common classes

Invalid UTF-8: host transfer or caller misuse.
Invalid length: empty prompt range, zero new-token count, or sampling values outside the contract.
Context exceeded: prompt/context capacity or continuation at the final slot.
Shape/dtype/tensor errors: incompatible artifact or source defect.
Generation failed: lock or uncategorized execution failure requiring source review.
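For triage, the classes above can be expressed as a small lookup from the result message to the likely causes listed in this runbook. This is a sketch under assumptions: the keyword matching is hypothetical, and the real result messages and numeric codes should be taken from the reviewed source rather than from this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FailureClass:
    label: str
    keywords: Tuple[str, ...]       # hypothetical substrings to match
    likely_causes: Tuple[str, ...]  # taken from the class list above


CLASSES = (
    FailureClass("Invalid UTF-8", ("invalid utf-8",),
                 ("host transfer", "caller misuse")),
    FailureClass("Invalid length", ("invalid length",),
                 ("empty prompt range", "zero new-token count",
                  "sampling values outside the contract")),
    FailureClass("Context exceeded", ("context exceeded",),
                 ("prompt/context capacity",
                  "continuation at the final slot")),
    FailureClass("Shape/dtype/tensor", ("shape", "dtype", "tensor"),
                 ("incompatible artifact", "source defect")),
    FailureClass("Generation failed", ("generation failed",),
                 ("lock",
                  "uncategorized execution failure requiring source review")),
)


def classify(result_message: str) -> Optional[FailureClass]:
    """Return the first class whose keyword appears in the message, else None."""
    text = result_message.lower()
    for cls in CLASSES:
        if any(keyword in text for keyword in cls.keywords):
            return cls
    return None
```

A message that matches no class returns `None`, which should be treated as a prompt to capture the record and review the source rather than to guess a class.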

Recovery

A later valid generate call starts from a cleared generation state and retains the accepted model. Use Reset for explicit context cleanup; reload the model only when its identity or model-owned state is suspect.
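The recovery guidance above amounts to choosing the least invasive action that addresses the suspicion. A minimal sketch of that decision follows; the `Recovery` enum and the flag names are hypothetical and only encode the rules stated in this section.

```python
from enum import Enum


class Recovery(Enum):
    GENERATE_AGAIN = "generate again"  # next valid generate clears generation state
    RESET = "reset"                    # explicit context cleanup
    RELOAD_MODEL = "reload model"      # identity or model-owned state is suspect


def choose_recovery(*, identity_suspect: bool,
                    model_state_suspect: bool,
                    want_context_cleanup: bool) -> Recovery:
    """Pick the least invasive recovery step that matches the stated rules."""
    if identity_suspect or model_state_suspect:
        return Recovery.RELOAD_MODEL
    if want_context_cleanup:
        return Recovery.RESET
    return Recovery.GENERATE_AGAIN
```

Keyword-only flags make each call site state explicitly which suspicion led to a reload, which is useful when the decision is later reviewed alongside the captured diagnostics.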

Scope

This starter page defines the questions, boundaries, evidence, and failure modes that should be recorded before a capability is presented as supported.

Engineering considerations

Identify the source, version, target environment, and owner.
Separate observed values from estimates and externally reported values.
Record trade-offs, unsupported cases, and fallback behavior.
Link performance statements to a compatible benchmark methodology.

Verification questions

What exact artifact, revision, backend, and environment were reviewed?
Which assumptions could change the result?
Which data should be retained so another engineer can reproduce the conclusion?
