SLM fixed header layout

Specifies every v1 fixed-header field, offset, type, validation rule, and compatibility implication.

Experimental
Last verified: 2026-06-25 00:00 UTC

Implementation evidence: this topic is grounded in the reviewed GGUF.MiRust.com source snapshot. It documents observed code and artifacts without claiming broad deployment, model quality, or production readiness.

Complete 108-byte SLM1 v1 header
Offset  Bytes  Field                    Validation
0       4      Magic                    Exactly the bytes SLM1.
4       4      version                  Exactly 1.
8       4      header_length            At least 108 and no larger than the file.
12      4      model_type               Stored; the current loader supports only the implemented transformer contract.
16      4      flags                    Bit 0 marks tied output.
20      4      vocab_size               Must agree with the tokenizer.
24      4      special_token_count      Tokenizer metadata.
28      4      hidden_size              Nonzero.
32      4      layer_count              Nonzero.
36      4      head_count               Nonzero.
40      4      kv_head_count            Nonzero; the current forward path also requires equality with head_count.
44      4      head_dim                 Nonzero, and head_count × head_dim = hidden_size.
48      4      ffn_size                 Nonzero.
52      4      max_context              Nonzero.
56      4      rope_theta               Stored as f32 bits.
60      4      rms_norm_epsilon         Stored as f32 bits.
64      8      tokenizer_offset         Tokenizer range must fit within the file.
72      8      tokenizer_length         Tokenizer range must fit within the file.
80      8      tensor_directory_offset  Directory range must fit within the file.
88      4      tensor_count             Multiplied by 64 with checked arithmetic; the resulting range is checked.
92      8      tensor_data_offset       Must be divisible by 64.
100     8      checksum                 Nonzero and equal to the custom checksum.
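
The table can be read as a sequence of fixed-offset reads followed by per-field checks. The following is a minimal sketch of that idea, not the reviewed loader's code. It assumes little-endian encoding, which this page does not state, and the names HeaderError, FixedHeader, u32_at, u64_at, and parse_fixed_header are hypothetical. It covers only the checks that need no other data: the tokenizer agreement, the file-range checks, and the custom checksum are left out.

```rust
/// Size in bytes of the fixed v1 header described in the table.
pub const FIXED_HEADER_LEN: usize = 108;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum HeaderError {
    TooShort,
    BadMagic,
    BadVersion(u32),
    BadHeaderLength(u32),
    ZeroField(&'static str),
    HeadShapeMismatch,
    MisalignedTensorData(u64),
}

/// Hypothetical subset of the decoded header fields.
#[derive(Debug)]
pub struct FixedHeader {
    pub header_length: u32,
    pub tied_output: bool,
    pub hidden_size: u32,
    pub head_count: u32,
    pub head_dim: u32,
    pub rope_theta: f32,
    pub tensor_data_offset: u64,
}

// ASSUMPTION: fields are little-endian; the page does not specify byte order.
fn u32_at(b: &[u8], off: usize) -> u32 {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&b[off..off + 4]);
    u32::from_le_bytes(raw)
}

fn u64_at(b: &[u8], off: usize) -> u64 {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&b[off..off + 8]);
    u64::from_le_bytes(raw)
}

pub fn parse_fixed_header(file: &[u8]) -> Result<FixedHeader, HeaderError> {
    // All later reads stay inside the first 108 bytes, so check that first.
    if file.len() < FIXED_HEADER_LEN {
        return Err(HeaderError::TooShort);
    }
    if &file[0..4] != b"SLM1" {
        return Err(HeaderError::BadMagic);
    }
    let version = u32_at(file, 4);
    if version != 1 {
        return Err(HeaderError::BadVersion(version));
    }
    let header_length = u32_at(file, 8);
    let declared = header_length as usize;
    if declared < FIXED_HEADER_LEN || declared > file.len() {
        return Err(HeaderError::BadHeaderLength(header_length));
    }
    // Every dimension listed as "Nonzero" in the table.
    for (name, off) in [
        ("hidden_size", 28),
        ("layer_count", 32),
        ("head_count", 36),
        ("kv_head_count", 40),
        ("head_dim", 44),
        ("ffn_size", 48),
        ("max_context", 52),
    ] {
        if u32_at(file, off) == 0 {
            return Err(HeaderError::ZeroField(name));
        }
    }
    let hidden_size = u32_at(file, 28);
    let head_count = u32_at(file, 36);
    let head_dim = u32_at(file, 44);
    // Checked multiply so an oversized product cannot wrap into a match.
    if head_count.checked_mul(head_dim) != Some(hidden_size) {
        return Err(HeaderError::HeadShapeMismatch);
    }
    let tensor_data_offset = u64_at(file, 92);
    if tensor_data_offset % 64 != 0 {
        return Err(HeaderError::MisalignedTensorData(tensor_data_offset));
    }
    Ok(FixedHeader {
        header_length,
        tied_output: u32_at(file, 16) & 1 != 0,
        hidden_size,
        head_count,
        head_dim,
        rope_theta: f32::from_bits(u32_at(file, 56)),
        tensor_data_offset,
    })
}
```

Checking the buffer length before any field read keeps every slice index in bounds, so the sketch returns an error on truncated input instead of panicking.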

Forward compatibility

The parser allows a header length greater than 108 but does not parse extension fields. A future version should define extension ownership, unknown-flag behavior, minimum reader version, and canonical serialization.
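
One way to picture the current behavior is a reader that accepts a longer header and treats everything past byte 108 as opaque. The sketch below illustrates that under stated assumptions: split_header is a hypothetical helper, not a function from the reviewed source, and it only separates the two regions without interpreting the extension bytes.

```rust
use std::convert::TryFrom;

/// Size in bytes of the fixed v1 header.
pub const FIXED_HEADER_LEN: usize = 108;

/// Hypothetical helper: splits the declared header into the fixed v1 part
/// and the unparsed extension region that follows it.
/// Returns None when header_length is below 108 or larger than the file.
pub fn split_header(file: &[u8], header_length: u32) -> Option<(&[u8], &[u8])> {
    let declared = usize::try_from(header_length).ok()?;
    if declared < FIXED_HEADER_LEN || declared > file.len() {
        return None;
    }
    // A v1 reader would parse `fixed` and skip `extension` untouched.
    let (fixed, extension) = file[..declared].split_at(FIXED_HEADER_LEN);
    Some((fixed, extension))
}
```

With a header_length of exactly 108 the extension slice is empty, which matches a file written by a v1 producer.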

Integer safety

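Each offset-plus-length sum uses checked arithmetic, so an overflow is reported as an error instead of wrapping. u64 offsets are converted to the host usize with an explicit failure when the value does not fit. Header dimensions are not used to size allocations until later shape checks have passed.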
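
The range checks described here can be sketched with the standard checked_add and usize::try_from calls. This is an illustration only: RangeError, checked_range, and directory_range are hypothetical names, and the 64-byte multiplier is taken from the tensor_count row of the header table.

```rust
use std::convert::TryFrom;
use std::ops::Range;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum RangeError {
    Overflow,
    DoesNotFitHost,
    OutOfFile,
}

/// Validates that [offset, offset + length) lies inside a file of file_len bytes.
pub fn checked_range(offset: u64, length: u64, file_len: usize) -> Result<Range<usize>, RangeError> {
    // Checked add: a wrapped sum could otherwise look like a small, valid end.
    let end = offset.checked_add(length).ok_or(RangeError::Overflow)?;
    // Explicit failure when a u64 does not fit the host pointer width.
    let start = usize::try_from(offset).map_err(|_| RangeError::DoesNotFitHost)?;
    let end = usize::try_from(end).map_err(|_| RangeError::DoesNotFitHost)?;
    if end > file_len {
        return Err(RangeError::OutOfFile);
    }
    Ok(start..end)
}

/// Multiplier applied to tensor_count, per the header table.
pub const DIRECTORY_ENTRY_LEN: u64 = 64;

/// Validates the tensor directory range for a given tensor_count.
pub fn directory_range(dir_offset: u64, tensor_count: u32, file_len: usize) -> Result<Range<usize>, RangeError> {
    // A u32 count times 64 cannot overflow u64; checked_mul is kept so the
    // arithmetic stays safe if the field widths ever change.
    let bytes = u64::from(tensor_count)
        .checked_mul(DIRECTORY_ENTRY_LEN)
        .ok_or(RangeError::Overflow)?;
    checked_range(dir_offset, bytes, file_len)
}
```

Returning a Range<usize> means later code can slice the file buffer directly with a value that has already been validated.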

Scope

This starter page defines the questions, boundaries, evidence, and failure modes that should be recorded before a capability is presented as supported.

Engineering considerations

  • Identify the source, version, target environment, and owner.
  • Separate observed values from estimates and externally reported values.
  • Record trade-offs, unsupported cases, and fallback behavior.
  • Link performance statements to a compatible benchmark methodology.

Verification questions

  • What exact artifact, revision, backend, and environment were reviewed?
  • Which assumptions could change the result?
  • Which data should be retained so another engineer can reproduce the conclusion?