Tensor naming and shape contracts

Lists required global and per-layer tensor names, exact shape relationships, tied-output behavior, and load-time index resolution.

Status: Experimental
Last verified: 2026-06-25 00:00 UTC

Implementation evidence: this topic is grounded in the reviewed GGUF.MiRust.com source snapshot. It documents observed code and artifacts without claiming broad deployment, model quality, or production readiness.

Global tensors

  • tok_embeddings.weight: vocabulary × hidden.
  • norm.weight: hidden.
  • output.weight: vocabulary × hidden; required unless the tied-output flag is set. Tied output conventionally means the output projection reuses tok_embeddings.weight.

Per-layer tensors

For each layers.N prefix: attn_norm.weight, ffn_norm.weight, wq.weight, wk.weight, wv.weight, wo.weight, w1.weight, w2.weight, and w3.weight.
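
The naming contract above can be enumerated mechanically. The following Rust sketch builds the list of names a loader would expect; it is illustrative only and not taken from the reviewed source. The function name `required_tensor_names`, zero-based layer numbering, and the dot joining prefix and suffix (as in `layers.0.wq.weight`) are assumptions, as is the reading that `output.weight` is simply not required when the tied-output flag is set.

```rust
/// Per-layer tensor suffixes, in the order listed in this section.
const LAYER_SUFFIXES: [&str; 9] = [
    "attn_norm.weight",
    "ffn_norm.weight",
    "wq.weight",
    "wk.weight",
    "wv.weight",
    "wo.weight",
    "w1.weight",
    "w2.weight",
    "w3.weight",
];

/// Builds the full list of tensor names a loader would expect.
/// Assumptions (not confirmed by the source): layers are numbered from 0,
/// and the prefix and suffix are joined with a dot.
fn required_tensor_names(n_layers: usize, tied_output: bool) -> Vec<String> {
    let mut names = vec![
        "tok_embeddings.weight".to_string(),
        "norm.weight".to_string(),
    ];
    // Assumed reading of the tied-output flag: no separate output matrix.
    if !tied_output {
        names.push("output.weight".to_string());
    }
    for layer in 0..n_layers {
        for suffix in LAYER_SUFFIXES.iter() {
            names.push(format!("layers.{}.{}", layer, suffix));
        }
    }
    names
}
```

Under these assumptions, a two-layer untied model yields 3 global names plus 18 per-layer names, and a loader could compare this list against the file's tensor directory to report missing or unexpected entries.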

Shape relationships

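  • Q/K/V/O projections (wq, wk, wv, wo) are hidden × hidden matrices in the current equal-head implementation.
  • w1 and w3 project from hidden width to FFN width; w2 projects from FFN width back to hidden width.
  • Norm vectors (norm, attn_norm, ffn_norm) have length equal to the hidden size.
  • Embedding and output matrices have one row per vocabulary entry.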
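
These relationships lend themselves to a load-time check. The Rust sketch below maps a tensor name to its expected shape so a loader could reject mismatched files early. The `Dims` struct, its field names, and `expected_shape` are hypothetical. The row-by-column orientation of the projection matrices (output width first, mirroring the vocabulary × hidden convention) is an assumption, since this page states only which widths each matrix connects.

```rust
/// Model dimensions. Field names are illustrative, not taken from the source.
struct Dims {
    vocab: usize,
    hidden: usize,
    ffn: usize,
}

/// Returns the expected shape for a tensor name, or None if the name is
/// not part of the contract. Vectors have one dimension, matrices two.
/// Assumption: matrices are laid out as [output_width, input_width].
fn expected_shape(name: &str, d: &Dims) -> Option<Vec<usize>> {
    if let Some(rest) = name.strip_prefix("layers.") {
        // Split "N.suffix" and require N to be a valid layer number.
        let (index, suffix) = rest.split_once('.')?;
        index.parse::<usize>().ok()?;
        return match suffix {
            "attn_norm.weight" | "ffn_norm.weight" => Some(vec![d.hidden]),
            "wq.weight" | "wk.weight" | "wv.weight" | "wo.weight" => {
                Some(vec![d.hidden, d.hidden])
            }
            "w1.weight" | "w3.weight" => Some(vec![d.ffn, d.hidden]),
            "w2.weight" => Some(vec![d.hidden, d.ffn]),
            _ => None,
        };
    }
    match name {
        "tok_embeddings.weight" | "output.weight" => Some(vec![d.vocab, d.hidden]),
        "norm.weight" => Some(vec![d.hidden]),
        _ => None,
    }
}
```

A loader could call `expected_shape` for every directory entry and compare the result with the stored dimensions. Returning `None` for unknown names means that extra tensors, such as biases, surface as contract violations instead of being silently ignored.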

Resolved indices

During load, the model resolves each required tensor name to its index in the tensor directory and stores those indices for the top-level tensors and for every layer. Token execution therefore avoids formatting layer names and linearly scanning the directory on every forward pass.
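
A minimal Rust sketch of this load-time resolution follows. It is not the reviewed implementation: the types `LayerIndices` and `ResolvedIndices`, the functions `find` and `resolve`, and the representation of the directory as a slice of names are all hypothetical. It also assumes zero-based `layers.N.` prefixes and that a tied model carries no `output.weight` entry.

```rust
/// Per-layer tensor suffixes, in the order listed in this section.
const LAYER_SUFFIXES: [&str; 9] = [
    "attn_norm.weight", "ffn_norm.weight",
    "wq.weight", "wk.weight", "wv.weight", "wo.weight",
    "w1.weight", "w2.weight", "w3.weight",
];

/// Directory indices for one layer, in LAYER_SUFFIXES order.
struct LayerIndices {
    tensors: [usize; 9],
}

/// All indices resolved once at load time (hypothetical layout).
struct ResolvedIndices {
    tok_embeddings: usize,
    norm: usize,
    /// None when the tied-output flag is set (assumed behavior).
    output: Option<usize>,
    layers: Vec<LayerIndices>,
}

/// Linear scan of the directory; performed only during load.
fn find(directory: &[String], name: &str) -> Result<usize, String> {
    directory
        .iter()
        .position(|entry| entry.as_str() == name)
        .ok_or_else(|| format!("missing tensor: {}", name))
}

/// Resolves every required name to a directory index, failing on the
/// first missing tensor.
fn resolve(
    directory: &[String],
    n_layers: usize,
    tied_output: bool,
) -> Result<ResolvedIndices, String> {
    let mut layers = Vec::with_capacity(n_layers);
    for layer in 0..n_layers {
        let mut tensors = [0usize; 9];
        for (slot, suffix) in tensors.iter_mut().zip(LAYER_SUFFIXES.iter()) {
            // Name formatting happens here, once per tensor, not per token.
            *slot = find(directory, &format!("layers.{}.{}", layer, suffix))?;
        }
        layers.push(LayerIndices { tensors });
    }
    Ok(ResolvedIndices {
        tok_embeddings: find(directory, "tok_embeddings.weight")?,
        norm: find(directory, "norm.weight")?,
        output: if tied_output {
            None
        } else {
            Some(find(directory, "output.weight")?)
        },
        layers,
    })
}
```

After `resolve` succeeds, the forward pass can address a tensor with plain array indexing such as `resolved.layers[i].tensors[2]`, so string formatting and directory scans stay confined to load time.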

Versioning risk

Tensor names and shapes are effectively an ABI. Renaming a tensor, transposing a matrix, introducing biases, adopting GQA shapes, or using an alternative FFN layout requires a new model-type contract or format version.

Scope

This starter page defines the questions, boundaries, evidence, and failure modes that should be recorded before a capability is presented as supported.

Engineering considerations

  • Identify the source, version, target environment, and owner.
  • Separate observed values from estimates and externally reported values.
  • Record trade-offs, unsupported cases, and fallback behavior.
  • Link performance statements to a compatible benchmark methodology.

Verification questions

  • What exact artifact, revision, backend, and environment were reviewed?
  • Which assumptions could change the result?
  • Which data should be retained so another engineer can reproduce the conclusion?