Resource estimation and uncertainty

Documents transparent formulas for weights, adapters, workspace, state, orchestration, and caches, together with their uncertainty, their sensitivity, and how planning estimates are replaced by measured evidence.

Research
Last verified: Not verified

Architecture guide: this topic defines a modular tiny-model planning contract. It does not claim that model artifacts exist, are compatible, or execute on this WordPress site.

Estimate components

  • Downloaded and cached weight bytes.
  • Resident base, expert, and adapter weights.
  • Tokenizer and metadata.
  • Activations, scratch workspace, and staging buffers.
  • Context and KV state.
  • GPU duplication or transcoding.
  • Orchestration, worker, and UI overhead.
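
The components above can be kept as separate, named terms so that each one can later be replaced by a measurement. The sketch below is illustrative only: the class, field, and function names are hypothetical, the example inputs are made-up planning values rather than figures for any real model, and it assumes cached bytes live in browser storage and therefore stay out of the memory total. The two formulas are the common planning approximations (parameter count times storage width for weights, and one key plus one value vector per layer, KV head, and token for KV state).

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MemoryEstimate:
    """Planning estimate in bytes. All field names are hypothetical."""
    cached_weights: int      # downloaded and cached weight bytes (storage)
    resident_weights: int    # base, expert, and adapter weights in memory
    tokenizer_metadata: int  # tokenizer and metadata
    workspace: int           # activations, scratch workspace, staging buffers
    kv_state: int            # context and KV state
    gpu_duplication: int     # extra copies from GPU upload or transcoding
    overhead: int            # orchestration, worker, and UI

    def resident_total(self) -> int:
        # Assumption: cached bytes sit in storage, so they are excluded here.
        return sum(getattr(self, f.name) for f in fields(self)
                   if f.name != "cached_weights")


def weight_bytes(param_count: int, bits_per_param: float) -> int:
    # Weight size is parameter count times storage width in bits, over 8.
    return int(param_count * bits_per_param / 8)


def kv_state_bytes(layers: int, kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int) -> int:
    # One key vector and one value vector per layer, per KV head, per token.
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value


# Made-up planning inputs, not measurements of any real artifact.
weights = weight_bytes(100_000_000, 4)
estimate = MemoryEstimate(
    cached_weights=weights,
    resident_weights=weights,
    tokenizer_metadata=2_000_000,
    workspace=10_000_000,
    kv_state=kv_state_bytes(12, 4, 64, 2048, 2),
    gpu_duplication=0,
    overhead=5_000_000,
)
print(estimate.resident_total())
```

Keeping each term as its own field means a reviewer can see which component dominates and which ones are still planning factors.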

Uncertainty

Publish low, central, and high estimates, or sensitivity ranges, when exact architecture data is unavailable. A single precise-looking number can hide an uncertainty that is larger than the difference between the plans being compared.
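
One way to produce such a range is to carry a low, central, and high value for every uncertain factor and evaluate the same formula at each level. The sketch below is a minimal illustration under stated assumptions: the names `Range`, `estimate_range`, and `sensitivity` are hypothetical, the formula is assumed to be non-decreasing in every factor (so all-low inputs give the low bound), and the input numbers are made up. The sensitivity function uses a simple one-at-a-time swing, which ignores interactions between factors.

```python
from typing import Callable, Dict, NamedTuple


class Range(NamedTuple):
    low: float
    central: float
    high: float


def estimate_range(formula: Callable[..., float],
                   factors: Dict[str, Range]) -> Range:
    # Evaluate the formula with every factor at its low, central, and high value.
    # Assumes the formula is non-decreasing in each factor.
    return Range(*(
        formula(**{name: getattr(r, level) for name, r in factors.items()})
        for level in Range._fields
    ))


def sensitivity(formula: Callable[..., float],
                factors: Dict[str, Range]) -> Dict[str, float]:
    # One-at-a-time swing: move one factor from low to high, others at central.
    central = {name: r.central for name, r in factors.items()}
    swings = {}
    for name, r in factors.items():
        at_low = formula(**{**central, name: r.low})
        at_high = formula(**{**central, name: r.high})
        swings[name] = at_high - at_low
    return swings


def weight_bytes(params: float, bits: float) -> float:
    return params * bits / 8


# Made-up planning ranges for an architecture whose details are not yet known.
factors = {
    "params": Range(80_000_000, 100_000_000, 120_000_000),
    "bits": Range(4, 4.5, 5),
}
print(estimate_range(weight_bytes, factors))
print(sensitivity(weight_bytes, factors))
```

With these inputs the range spans 40 MB to 75 MB around a central 56.25 MB, and the parameter count contributes a larger swing than the storage width, which shows where better architecture data would narrow the estimate most.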

Promotion rule

Replace planning factors with measured values from the exact browser, implementation revision, artifact, context, and workload before changing a result to Observed.
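
The rule can be enforced mechanically by refusing the Observed label until every piece of measurement context is recorded. The sketch below is one possible shape, not a prescribed schema: the `Result` class and its field names are hypothetical, the "Estimated" label for unpromoted results is an assumption (the page only names Observed), and the example values are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional

# Context the promotion rule asks for; the field names are hypothetical.
REQUIRED_CONTEXT = ("browser", "implementation_revision", "artifact",
                    "context_tokens", "workload")


@dataclass
class Result:
    planned_bytes: int
    measured_bytes: Optional[int] = None
    browser: Optional[str] = None
    implementation_revision: Optional[str] = None
    artifact: Optional[str] = None
    context_tokens: Optional[int] = None
    workload: Optional[str] = None

    def missing_evidence(self) -> List[str]:
        # List every required item that has not been recorded yet.
        missing = [n for n in REQUIRED_CONTEXT if getattr(self, n) is None]
        if self.measured_bytes is None:
            missing.append("measured_bytes")
        return missing

    def status(self) -> str:
        # "Estimated" is an assumed label for anything not yet promoted.
        return "Observed" if not self.missing_evidence() else "Estimated"


result = Result(planned_bytes=92_000_000)
print(result.status(), result.missing_evidence())

# Placeholder values standing in for a real measurement record.
result.measured_bytes = 88_500_000
result.browser = "example-browser 1.0"
result.implementation_revision = "example-revision"
result.artifact = "example-artifact"
result.context_tokens = 2048
result.workload = "example-workload"
print(result.status())
```

Returning the list of missing items, rather than only a boolean, tells the reviewer exactly which evidence is still outstanding.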

Scope

This starter page defines the questions, boundaries, evidence, and failure modes that should be recorded before a capability is presented as supported.

Engineering considerations

  • Identify the source, version, target environment, and owner.
  • Separate observed values from estimates and externally reported values.
  • Record trade-offs, unsupported cases, and fallback behavior.
  • Link performance statements to a compatible benchmark methodology.
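
Separating observed, estimated, and externally reported values is easier when every number carries its provenance and source with it. The sketch below is a hypothetical illustration: the `Provenance` levels, their ordering, and the `Tagged` and `combine` names are all assumptions. It treats a combined figure as no stronger than its weakest input, so a total that includes one estimate is itself reported as an estimate.

```python
from dataclasses import dataclass
from enum import IntEnum


class Provenance(IntEnum):
    # Ordered weakest to strongest; this ordering is an assumption.
    ESTIMATED = 0
    REPORTED = 1   # externally reported, not reproduced locally
    OBSERVED = 2


@dataclass(frozen=True)
class Tagged:
    value: float
    provenance: Provenance
    source: str    # where the number came from, including version and owner


def combine(*parts: Tagged) -> Tagged:
    # A sum is only as trustworthy as its weakest input.
    weakest = min(p.provenance for p in parts)
    return Tagged(
        value=sum(p.value for p in parts),
        provenance=weakest,
        source=" + ".join(p.source for p in parts),
    )


# Placeholder values and sources for illustration only.
weights = Tagged(50_000_000, Provenance.OBSERVED, "local measurement")
kv_state = Tagged(25_000_000, Provenance.ESTIMATED, "planning formula")
total = combine(weights, kv_state)
print(total.value, total.provenance.name, total.source)
```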

Verification questions

  • What exact artifact, revision, backend, and environment were reviewed?
  • Which assumptions could change the result?
  • Which data should be retained so another engineer can reproduce the conclusion?