Q8 row scales and Q4 block scales

Specifies the current quantization granularity, payload-length rules, nibble ordering, sign extension, and validation needed for compatible converters.

Experimental

Last verified: 2026-06-25 00:00 UTC
Updated: 2026-06-25

Implementation evidence: this topic is grounded in the reviewed GGUF.MiRust.com source snapshot. It documents observed code and artifacts without claiming broad deployment, model quality, or production readiness.

Q8_0

Each matrix element occupies one signed byte (i8). One positive, finite f32 scale is stored per row, so block_size must equal the number of columns. At runtime, each output element accumulates i8 × scale × input across its row.
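
The row-scale rule can be illustrated with a minimal sketch. The function name q8_row_matvec, its signature, and the row-major layout are assumptions made for illustration, not code from the reviewed snapshot. It multiplies in the order stated above (i8 × scale × input) and rejects a scale that is not positive and finite.

```rust
/// Hypothetical helper (not from the reviewed source): multiply a Q8_0
/// matrix by an f32 vector. Assumes row-major i8 weights and exactly one
/// f32 scale per row, i.e. block_size == cols.
fn q8_row_matvec(
    weights: &[i8],
    scales: &[f32],
    cols: usize,
    input: &[f32],
) -> Option<Vec<f32>> {
    let rows = scales.len();
    // One byte per element: the payload must hold exactly rows * cols bytes.
    if cols == 0 || input.len() != cols || weights.len() != rows.checked_mul(cols)? {
        return None;
    }
    let mut out = Vec::with_capacity(rows);
    for (row, &scale) in weights.chunks_exact(cols).zip(scales) {
        // Scales must be positive and finite.
        if !(scale.is_finite() && scale > 0.0) {
            return None;
        }
        let mut acc = 0.0f32;
        for (&q, &x) in row.iter().zip(input) {
            // Accumulate i8 * scale * input, element by element.
            acc += f32::from(q) * scale * x;
        }
        out.push(acc);
    }
    Some(out)
}
```

For a 2×2 matrix with weights [1, -2, 3, 4], scales [0.5, 2.0], and input [1.0, 1.0], the sketch returns [-0.5, 14.0]. Hoisting the scale out of the inner loop is mathematically equivalent but can round differently, so a compatible converter should probably match whichever order the runtime actually uses.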

Q4_0

Two signed four-bit values are packed per byte, low nibble first. Nibble values 0–7 are read as is; values 8–15 sign-extend to −8 through −1. One positive, finite scale is stored for every block_size values in each row; block_size must be even and must divide the row width.
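
The nibble order and sign extension can be sketched as follows. All function names here are hypothetical. The dequantization step (value = nibble × scale) is an assumption made by analogy with the Q8_0 description; the text above states only how scales are laid out per block.

```rust
/// Sign-extend a 4-bit value: 0..=7 stay unchanged, 8..=15 map to -8..=-1.
fn sign_extend_nibble(n: u8) -> i8 {
    let n = (n & 0x0F) as i8;
    if n >= 8 { n - 16 } else { n }
}

/// Unpack two signed 4-bit values per byte, low nibble first.
fn unpack_q4(packed: &[u8]) -> Vec<i8> {
    let mut out = Vec::with_capacity(packed.len() * 2);
    for &byte in packed {
        out.push(sign_extend_nibble(byte & 0x0F)); // low nibble comes first
        out.push(sign_extend_nibble(byte >> 4)); // then the high nibble
    }
    out
}

/// Hypothetical: dequantize one packed row with one scale per block.
/// ASSUMPTION: value = nibble * scale; the source does not spell this out.
fn dequantize_q4_row(packed: &[u8], scales: &[f32], block_size: usize) -> Option<Vec<f32>> {
    // block_size must be even (and non-zero, so the division below is safe).
    if block_size == 0 || block_size % 2 != 0 {
        return None;
    }
    let values = unpack_q4(packed);
    // block_size must divide the row width, with one scale per block.
    if values.len() % block_size != 0 || values.len() / block_size != scales.len() {
        return None;
    }
    Some(
        values
            .chunks_exact(block_size)
            .zip(scales)
            .flat_map(|(block, &scale)| block.iter().map(move |&q| f32::from(q) * scale))
            .collect(),
    )
}
```

As a check on the ordering, the bytes 0x8F, 0x07 unpack to -1, -8, 7, 0: the low nibble 0xF becomes -1 before the high nibble 0x8 becomes -8.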

Admission

The packer validator checks payload lengths, scale ranges, alignment, block divisibility, duplicate hashes, and non-finite values. A converter that changes any of these contracts must advance the container version or provide an explicit compatibility flag.
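
A few of these checks can be sketched for a Q4_0 tensor. This is an illustrative subset, not the packer validator itself: AdmissionError, check_q4, and the assumption that rows are packed contiguously without padding are all hypothetical. Alignment and duplicate-hash checks are omitted because the text above does not describe their rules.

```rust
/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
enum AdmissionError {
    BadBlockSize,
    Overflow,
    PayloadLength { expected: usize, actual: usize },
    ScaleCount { expected: usize, actual: usize },
    BadScale(usize),
}

/// Hypothetical admission check for a Q4_0 tensor.
/// ASSUMPTION: rows are stored back to back, two values per byte, no padding.
fn check_q4(
    rows: usize,
    cols: usize,
    block_size: usize,
    payload: &[u8],
    scales: &[f32],
) -> Result<(), AdmissionError> {
    // Block divisibility: block_size is even and divides the row width.
    if block_size == 0 || block_size % 2 != 0 || cols % block_size != 0 {
        return Err(AdmissionError::BadBlockSize);
    }
    // Payload length: cols is even here, so each row needs cols / 2 bytes.
    let expected_bytes = rows.checked_mul(cols / 2).ok_or(AdmissionError::Overflow)?;
    if payload.len() != expected_bytes {
        return Err(AdmissionError::PayloadLength {
            expected: expected_bytes,
            actual: payload.len(),
        });
    }
    // One scale per block in every row.
    let expected_scales = rows
        .checked_mul(cols / block_size)
        .ok_or(AdmissionError::Overflow)?;
    if scales.len() != expected_scales {
        return Err(AdmissionError::ScaleCount {
            expected: expected_scales,
            actual: scales.len(),
        });
    }
    // Scale range: every scale must be positive and finite.
    for (index, &scale) in scales.iter().enumerate() {
        if !(scale.is_finite() && scale > 0.0) {
            return Err(AdmissionError::BadScale(index));
        }
    }
    Ok(())
}
```

Returning the first failure with its expected and actual values keeps the rejection reproducible, which fits the evidence-oriented tone of this page; a real validator might instead collect every failure before reporting.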

Scope

This starter page defines the questions, boundaries, evidence, and failure modes that should be recorded before a capability is presented as supported.

Engineering considerations

Identify the source, version, target environment, and owner.
Separate observed values from estimates and externally reported values.
Record trade-offs, unsupported cases, and fallback behavior.
Link performance statements to a compatible benchmark methodology.

Verification questions

What exact artifact, revision, backend, and environment were reviewed?
Which assumptions could change the result?
Which data should be retained so another engineer can reproduce the conclusion?
