Path to modular tiny models

This page translates the current single-model runtime into concrete changes: registries, package contracts, loaders, resident-set management, orchestration, and evidence.

Experimental
Last verified: 2026-06-25 00:00 UTC

Implementation evidence: this topic is grounded in the reviewed GGUF.MiRust.com source snapshot. It documents observed code and artifacts without claiming broad deployment, model quality, or production readiness.

Required package contract

Every skill package must declare a stable ID, a semantic version, the artifact's SHA-256 digest, SLM (small language model) and runtime compatibility, exact base-model and tokenizer requirements, an input/output schema, dependencies, authority scope, license, resource envelope, quality evidence, and a rollback identity.
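The contract above can be sketched as a small manifest type with a validator. This is a minimal illustration in Python, not the reviewed project's code: every field name, the simplified MAJOR.MINOR.PATCH check, and the `validate` helper are assumptions made for the example.

```python
import hashlib
import re
from dataclasses import dataclass, field

# Simplified semantic-version check (assumption: no pre-release or build tags).
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")


@dataclass(frozen=True)
class SkillPackage:
    # All field names are hypothetical; they mirror the contract items in the text.
    package_id: str
    version: str
    artifact_sha256: str
    runtime_compat: str
    base_model_id: str
    tokenizer_sha256: str
    io_schema: dict
    dependencies: tuple = ()
    authority_scope: tuple = ()
    license: str = ""
    resource_envelope: dict = field(default_factory=dict)
    quality_evidence: tuple = ()
    rollback_to: str = ""


def validate(pkg, artifact):
    """Return a list of contract violations; an empty list means the package passes."""
    errors = []
    if not pkg.package_id:
        errors.append("missing package_id")
    if not SEMVER.match(pkg.version):
        errors.append("version is not MAJOR.MINOR.PATCH")
    if hashlib.sha256(artifact).hexdigest() != pkg.artifact_sha256:
        errors.append("artifact digest mismatch")
    if not pkg.license:
        errors.append("missing license")
    if not pkg.rollback_to:
        errors.append("missing rollback identity")
    return errors


# Usage with placeholder bytes standing in for a real model artifact.
artifact = b"placeholder artifact bytes"
pkg = SkillPackage(
    package_id="example-skill",
    version="0.1.0",
    artifact_sha256=hashlib.sha256(artifact).hexdigest(),
    runtime_compat=">=0.1,<0.2",
    base_model_id="example-base",
    tokenizer_sha256=hashlib.sha256(b"placeholder tokenizer").hexdigest(),
    io_schema={"input": "text", "output": "text"},
    license="Apache-2.0",
    rollback_to="example-skill@0.0.9",
)
problems = validate(pkg, artifact)
tampered = validate(pkg, b"different bytes")
```

Returning a list of violations rather than raising on the first one lets an installer report every contract gap in a single pass.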

Runtime handles

Replace the global single model with explicit runtime, artifact, module, and session handles. Keep the immutable artifact cache separate from the resident decoded model state and from per-session KV state.
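One way to picture the handle separation is the Python sketch below. The class names (`ArtifactHandle`, `ModuleHandle`, `SessionHandle`, `Runtime`) and their fields are hypothetical and only illustrate the ownership boundaries described above; a list stands in for real KV state.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ArtifactHandle:
    """Immutable, content-addressed entry in the artifact cache."""
    sha256: str
    path: str


@dataclass
class ModuleHandle:
    """Resident decoded model state built from exactly one artifact."""
    artifact: ArtifactHandle
    resident_bytes: int


@dataclass
class SessionHandle:
    """Per-session KV state; never shared between sessions."""
    module: ModuleHandle
    kv_cache: list = field(default_factory=list)


class Runtime:
    """Owns every handle explicitly instead of relying on one global model."""

    def __init__(self):
        self.artifacts = {}  # sha256 -> ArtifactHandle
        self.modules = {}    # sha256 -> ModuleHandle
        self.sessions = []

    def register_artifact(self, sha256, path):
        # Re-registering the same digest returns the existing handle.
        return self.artifacts.setdefault(sha256, ArtifactHandle(sha256, path))

    def load_module(self, artifact, resident_bytes):
        # One resident module per artifact digest.
        return self.modules.setdefault(
            artifact.sha256, ModuleHandle(artifact, resident_bytes)
        )

    def open_session(self, module):
        session = SessionHandle(module)
        self.sessions.append(session)
        return session


# Usage: two sessions share one resident module but keep separate KV state.
rt = Runtime()
art = rt.register_artifact("abc123", "/cache/abc123.gguf")
mod = rt.load_module(art, resident_bytes=1024)
s1 = rt.open_session(mod)
s2 = rt.open_session(mod)
s1.kv_cache.append("token-state")
```

In this sketch the artifact handle is frozen while the module and session handles are mutable, which mirrors the split between cached bytes and live state.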

Resident-set manager

Track installed, active, and parallel counts independently. Load and evict transactionally; refuse any activation that would exceed host or device budgets; retain a known-good module set for rollback.
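A minimal sketch of such a manager follows, assuming budgets are plain integers in arbitrary units and that a load/evict request is validated in full before any state changes. `ResidentSetManager`, `BudgetExceeded`, and all method names are hypothetical, and the sketch does not guard against evicting a module that is currently running.

```python
class BudgetExceeded(Exception):
    """Raised when a proposed resident set or run would exceed a limit."""


class ResidentSetManager:
    def __init__(self, host_budget, device_budget, max_parallel):
        self.host_budget = host_budget
        self.device_budget = device_budget
        self.max_parallel = max_parallel
        self.installed = {}          # module id -> (host units, device units)
        self.active = set()          # ids currently resident
        self.running = 0             # modules executing in parallel right now
        self.known_good = frozenset()

    def install(self, module_id, host_units, device_units):
        self.installed[module_id] = (host_units, device_units)

    def apply(self, load=(), evict=()):
        """Apply loads and evictions as one transaction, or change nothing."""
        proposed = (self.active - set(evict)) | set(load)
        unknown = proposed - set(self.installed)
        if unknown:
            raise KeyError(f"not installed: {sorted(unknown)}")
        host = sum(self.installed[i][0] for i in proposed)
        device = sum(self.installed[i][1] for i in proposed)
        if host > self.host_budget or device > self.device_budget:
            raise BudgetExceeded(f"host={host} device={device}")
        self.active = proposed  # commit only after every check has passed

    def begin_run(self, module_id):
        if module_id not in self.active:
            raise KeyError(f"not active: {module_id}")
        if self.running >= self.max_parallel:
            raise BudgetExceeded("parallel limit reached")
        self.running += 1

    def end_run(self):
        self.running -= 1

    def mark_known_good(self):
        self.known_good = frozenset(self.active)

    def rollback(self):
        self.active = set(self.known_good)


# Usage: the third module does not fit, so the request is refused as a whole.
mgr = ResidentSetManager(host_budget=8, device_budget=4, max_parallel=1)
mgr.install("router", 1, 1)
mgr.install("coder", 4, 2)
mgr.install("vision", 6, 3)
mgr.apply(load=["router", "coder"])
mgr.mark_known_good()
refused = False
try:
    mgr.apply(load=["vision"])
except BudgetExceeded:
    refused = True
```

Because the budget check runs against the proposed set before `self.active` is reassigned, a refused request leaves the resident set exactly as it was.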

Initial orchestration order

  1. Deterministic rule or embedding router selecting one specialist.
  2. Confidence cascade with bounded escalation.
  3. Explicit pipeline with typed intermediate contracts.
  4. Small committee only when parallel resource evidence exists.
  5. Compatible shared-base adapters after base/tokenizer identity is cryptographically bound.
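The first two stages in the list above can be sketched as follows. This is an illustration under stated assumptions: the keyword rules, the stub specialists, and the threshold values are invented for the example, and each tier is assumed to return an `(answer, confidence)` pair.

```python
def route(prompt, rules, default):
    """Deterministic rule router: the first matching keyword wins."""
    text = prompt.lower()
    for keyword, specialist in rules:
        if keyword in text:
            return specialist
    return default


def cascade(prompt, tiers, threshold, max_escalations):
    """Try tiers in order and stop at the first answer meeting the threshold.

    Each tier is a (name, fn) pair where fn(prompt) returns (answer, confidence).
    At most max_escalations tiers beyond the first are tried; if none meets
    the threshold, the last answer tried is returned together with the trace.
    """
    trace = []
    answer = None
    for name, fn in tiers[: max_escalations + 1]:
        answer, confidence = fn(prompt)
        trace.append((name, confidence))
        if confidence >= threshold:
            break
    return answer, trace


# Stub specialists standing in for real modules (hypothetical values).
def tiny(prompt):
    return "maybe 4", 0.40


def small(prompt):
    return "4", 0.90


def large(prompt):
    return "4", 0.99


rules = [("sum", "math"), ("translate", "translation")]
specialist = route("Sum 2 and 2", rules, default="general")
answer, trace = cascade(
    "Sum 2 and 2",
    [("tiny", tiny), ("small", small), ("large", large)],
    threshold=0.8,
    max_escalations=1,
)
```

The returned trace lists every tier that was tried with its confidence, so the escalation path can be recorded alongside the final answer.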

Evidence

Each route decision must record candidates, selected module, score/confidence, cost estimate, observed result, fallback, and exact artifact revisions.
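A route-decision record with these fields might look like the sketch below. The field names, the millisecond unit for the cost estimate, and the one-JSON-object-per-decision format are assumptions for illustration, not the project's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class RouteDecision:
    # Hypothetical field names mirroring the items listed in the text.
    candidates: tuple
    selected: str
    confidence: float
    cost_estimate_ms: float      # unit is an assumption made for this sketch
    observed_result: str
    fallback: Optional[str]
    artifact_revisions: dict     # module id -> artifact SHA-256

    def to_json(self):
        # Sorted keys keep the serialized form stable across runs.
        return json.dumps(asdict(self), sort_keys=True)


# Usage with placeholder values.
record = RouteDecision(
    candidates=("math", "general"),
    selected="math",
    confidence=0.9,
    cost_estimate_ms=12.5,
    observed_result="ok",
    fallback=None,
    artifact_revisions={"math": "abc123", "general": "def456"},
)
line = record.to_json()
restored = json.loads(line)
```

Keeping the exact artifact digests in each record ties an observed result to the specific module revisions that produced it.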

Scope

This starter page defines the questions, boundaries, evidence, and failure modes that should be recorded before a capability is presented as supported.

Engineering considerations

  • Identify the source, version, target environment, and owner.
  • Separate observed values from estimates and externally reported values.
  • Record trade-offs, unsupported cases, and fallback behavior.
  • Link performance statements to a compatible benchmark methodology.

Verification questions

  • What exact artifact, revision, backend, and environment were reviewed?
  • Which assumptions could change the result?
  • Which data should be retained so another engineer can reproduce the conclusion?