Teleodynamic modular intelligence / Project guide

Compose intelligence by function, not bulk.

Choose a planning scale, the number of installed tiny models, the maximum active set, hyper-specialized skill roles, and an orchestration pattern. MiRust produces a transparent architecture handoff for a modular in-browser AI system without downloading or executing a model.


[Diagram: Request, Router, Verify, Output. Installed: N; Active: K_A ≪ N; Parallel: K_P ≤ K_A]

Core distinction

Installed capacity is not active compute

A modular system may cache many narrow specialists while activating only one or two for a request. The architecture should record N_total installed roles, K_active maximum active roles, and K_parallel concurrent roles rather than describing every downloaded model as simultaneously running.

Installed set

Controls download size, local storage, update surface, license inventory, provenance, and dependency resolution.

Active set

Controls resident weights, working buffers, context state, latency, and immediate power use.

Structural policy

Controls when the system may activate, deactivate, replace, retire, reserve, or choose no-op.
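
The difference between the installed set and the active set can be sketched as a small bookkeeping class. This is only an illustration: the class name `ResidentSet`, its methods, and the least-recently-used eviction rule are assumptions made for the example, not part of the composer, which leaves the structural policy open.

```python
from collections import OrderedDict


class ResidentSet:
    """Tracks which installed roles are currently resident (hypothetical helper)."""

    def __init__(self, installed, k_active):
        # The active-set limit may not exceed the installed slots.
        if not 1 <= k_active <= len(installed):
            raise ValueError("k_active must be between 1 and the installed count")
        self.installed = set(installed)
        self.k_active = k_active
        self._resident = OrderedDict()  # role -> None, least recently used first

    def activate(self, role):
        """Make a role resident; return the evicted role, or None."""
        if role not in self.installed:
            raise KeyError(f"{role} is not installed")
        evicted = None
        if role in self._resident:
            self._resident.move_to_end(role)  # refresh recency only
        else:
            if len(self._resident) >= self.k_active:
                # Assumption: evict the least recently used role.
                evicted, _ = self._resident.popitem(last=False)
            self._resident[role] = None
        return evicted

    @property
    def resident(self):
        return list(self._resident)


roles = ResidentSet(["intent-router", "retrieval", "summarization"], k_active=2)
roles.activate("intent-router")
roles.activate("retrieval")
evicted = roles.activate("summarization")
print(evicted, roles.resident)
```

Three roles stay installed throughout, but no more than two are ever resident, which is the distinction the plan is meant to record.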

Planning tool only: this composer does not detect hardware, download weights, run inference, merge adapters, or verify model compatibility. It produces a transparent architecture handoff for the separately governed GGUF implementation project.

01 Target envelope

Device planning profile: A middle planning allowance for local experimentation. Validate on the target browser and device.
Custom memory envelope (MiB): Used only when the custom device profile is selected.
Context planning allowance (tokens): Feeds a generic state allowance. Replace it with model-specific KV-cache measurements later.

02 Model scale and count

Default specialist scale: A compact reference scale for tightly scoped generation or transformation experiments.
Custom parameters (millions): Used only when Custom parameter count is selected.
Weight precision planning factor: Uses 0.60 estimated bytes per parameter, including a generic allowance for scales and metadata.
Installed specialist slots: Controls total model or adapter cache cost.
Maximum active specialists: Controls the maximum resident working set. It may not exceed installed slots.
Maximum parallel specialists: Controls simultaneous execution pressure. It may not exceed the active-set limit.

03 Module topology

Independent specialist models: Each selected skill is represented by a separate model artifact. A router or pipeline activates only the required subset.
Shared base with skill adapters: One base model remains resident while compact compatible adapters supply targeted skills. Compatibility must be verified.
Hybrid base, adapters, and standalone experts: Combines a shared base with adapter-compatible skills and separate experts for tasks that need distinct architectures.

04 Orchestration pattern

Sparse router: Selects one or a small active set for each request. Lowest average compute when routing is reliable.
Explicit pipeline: Runs selected roles in a declared sequence. Easier to inspect, but latency and error propagation accumulate.
Confidence cascade: Starts with the cheapest specialist and escalates only when a declared acceptance rule fails.
Parallel committee: Runs several specialists and aggregates results. Useful for cross-checking, but compute grows with the active set.
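
Of the four patterns, the confidence cascade is the easiest to show in a few lines. The sketch below is a minimal illustration under stated assumptions: the `Specialist` tuple, the relative `cost` field, and a single confidence threshold as the acceptance rule are all hypothetical, and the lambdas stand in for real specialists.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Specialist(NamedTuple):
    name: str
    cost: float                              # relative cost, hypothetical units
    run: Callable[[str], Tuple[str, float]]  # returns (answer, confidence)


def confidence_cascade(
    request: str, specialists: List[Specialist], threshold: float
) -> Tuple[Optional[str], List[str]]:
    """Try specialists from cheapest to most expensive; stop at the first
    answer that meets the acceptance rule (here: confidence >= threshold)."""
    tried = []
    for spec in sorted(specialists, key=lambda s: s.cost):
        answer, confidence = spec.run(request)
        tried.append(spec.name)
        if confidence >= threshold:
            return answer, tried
    return None, tried  # nothing was accepted; the caller falls back or no-ops


# Stand-in specialists that return fixed answers and confidences.
tiny = Specialist("tiny", 1.0, lambda req: ("draft", 0.4))
small = Specialist("small", 3.0, lambda req: ("checked", 0.9))

answer, tried = confidence_cascade("summarize", [small, tiny], threshold=0.8)
print(answer, tried)
```

The cheaper stand-in runs first and is rejected, so the cascade escalates once. A sparse router would instead pick one specialist up front, and a committee would run both and aggregate.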

06 Teleodynamic resource posture

Balanced viability: Treats task benefit, memory, latency, and uncertainty as comparable planning pressures.
Latency-first viability: Raises the cost of parallel activation and favors routing, early exits, and smaller active sets.
Quality-first viability: Allows more verification or committee work when the declared resource envelope can support it.
Minimum-residency viability: Penalizes stored and active structure aggressively and makes no-op or retirement easier to select.

Planning rule: V(a) = expected task benefit − memory cost − latency cost − uncertainty cost

Activate, replace, retire, or reserve structure only when a candidate is affordable and has higher local value than no-op. This is an engineering policy sketch, not proof of teleodynamic self-maintenance.
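
The planning rule can be sketched as a scoring function plus a selection step. Several details here are assumptions made for illustration: the `Candidate` fields, the way policy weights multiply each term, the convention that no-op has value 0, and the use of free memory as the only affordability test. The unit weights mirror the balanced policy in the handoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    action: str            # activate, deactivate, replace, retire, or reserve
    benefit: float         # expected task benefit
    memory_cost: float
    latency_cost: float
    uncertainty_cost: float
    memory_mib: float      # extra resident memory the action would need


# Balanced policy: all four pressures weighted equally.
BALANCED = {"benefit": 1.0, "memory": 1.0, "latency": 1.0, "uncertainty": 1.0}


def local_value(c: Candidate, w=BALANCED) -> float:
    """V(a) = benefit - memory cost - latency cost - uncertainty cost."""
    return (w["benefit"] * c.benefit
            - w["memory"] * c.memory_cost
            - w["latency"] * c.latency_cost
            - w["uncertainty"] * c.uncertainty_cost)


def choose(candidates, free_mib: float, w=BALANCED):
    """Return (action, value) for the best affordable candidate,
    or ("noop", 0.0) when nothing beats no-op."""
    best_action, best_value = "noop", 0.0  # assumption: V(noop) = 0
    for c in candidates:
        if c.memory_mib > free_mib:
            continue  # unaffordable candidates are never selected
        value = local_value(c, w)
        if value > best_value:
            best_action, best_value = c.action, value
    return best_action, best_value


options = [
    Candidate("activate", 0.9, 0.2, 0.3, 0.1, memory_mib=9.2),
    Candidate("replace", 2.0, 0.1, 0.1, 0.1, memory_mib=5000.0),  # too large
]
print(choose(options, free_mib=100.0))
```

The higher-scoring replacement is skipped because it does not fit, which shows why affordability is checked before value rather than folded into it.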


Within the planning envelope: 7.4% of the selected planning envelope

Peak runtime planning estimate: 114 MiB / 1,536 MiB
Installed cache: 59.3 MiB
Resident active weights: 18.3 MiB
Concurrent compute weights: 9.2 MiB
Workspace allowance: 48.0 MiB
Context-state allowance: 16.0 MiB
Orchestration allowance: 32.0 MiB

Topology counts

Base models: 0
Standalone experts: 6
Skill adapters: 0
Active specialists max: 2
Parallel specialists max: 1

Execution shape

01 Request and constraints: Task, target, privacy, latency, and resource state
02 Sparse router: Proposes the smallest useful active set
03 2 resident / 1 parallel: Only compatible and locally valuable roles are activated; concurrency remains separately bounded
04 Verify, merge, or no-op: Record the selected action, cost, evidence, and fallback

Assigned skill slots

01 Intent and domain router (Route · conceptual-role)
02 Retrieval and embedding specialist (Ground · conceptual-role)
03 Summarization specialist (Transform · conceptual-role)
04 Verifier and contradiction checker (Verify · conceptual-role)
05 Memory compression specialist (Maintain · conceptual-role)
06 Structured-output specialist (Emit · conceptual-role)

Planning warnings

No structural warnings were generated. Exact artifacts and measurements are still required.

Estimation method

Weight estimate = parameters × selected bytes-per-parameter planning factor.
Adapter estimate = the greater of 1.5 MiB or 4% of the selected base-weight estimate.
Workspace estimate = the greater of 48 MiB or 22% of the weights participating concurrently.
Context-state allowance = context tokens × parallel count × square root of parameter millions × 2 KiB.
These formulas expose assumptions; they do not replace measured artifact, activation, KV-cache, GPU-buffer, tokenizer, or browser data.
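
As a check on the formulas above, the following sketch recomputes the displayed plan (16 million parameters, 0.6 bytes per parameter, 2 active, 1 parallel, 2,048 context tokens, 1,536 MiB envelope). Two points are assumptions rather than stated rules: the peak is taken as active weights plus workspace, context-state, and orchestration allowances, which matches the displayed figures, and the 32 MiB orchestration allowance is copied from the plan because no formula is given for it. The installed-cache figure is left out because the listed formulas do not fully specify it.

```python
import math

MIB = 1024 * 1024


def estimate(params_millions, bytes_per_param, active, parallel,
             context_tokens, envelope_mib, orchestration_mib=32.0):
    """Recompute the planning estimates in MiB from the listed formulas."""
    module = params_millions * 1e6 * bytes_per_param / MIB
    active_weights = active * module
    concurrent = parallel * module
    adapter = max(1.5, 0.04 * module)            # greater of 1.5 MiB or 4%
    workspace = max(48.0, 0.22 * concurrent)     # greater of 48 MiB or 22%
    # tokens x parallel x sqrt(parameter millions) x 2 KiB, converted to MiB
    context = context_tokens * parallel * math.sqrt(params_millions) * 2 / 1024
    # Assumption: peak is the sum of these four components.
    peak = active_weights + workspace + context + orchestration_mib
    return {
        "module_weight_mib": round(module, 2),
        "adapter_weight_mib": round(adapter, 2),
        "active_weights_mib": round(active_weights, 2),
        "concurrent_compute_weights_mib": round(concurrent, 2),
        "workspace_mib": round(workspace, 2),
        "context_state_allowance_mib": round(context, 2),
        "peak_runtime_mib": round(peak, 2),
        "budget_utilization_percent": round(100 * peak / envelope_mib, 1),
    }


plan = estimate(16, 0.6, active=2, parallel=1,
                context_tokens=2048, envelope_mib=1536)
for key, value in plan.items():
    print(f"{key}: {value}")
```

With these inputs the sketch yields a 9.16 MiB module, 18.31 MiB of active weights, a 114.31 MiB peak, and 7.4% utilization, in line with the plan shown above.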

Machine-readable architecture handoff

{
    "schema": "mirust-modular-tiny-model-plan/1.1",
    "plan_id": "MC-73472F2B3D",
    "generated_by": "MiRust Tiny Model Composer 1.7.0",
    "generated_utc": "2026-06-27T06:38:43+00:00",
    "status": "conceptual-architecture-plan",
    "boundary": {
        "guide_site": "MiRust.com",
        "implementation_project": "GGUF.MiRust.com",
        "executes_models": false,
        "downloads_model_weights": false,
        "claim": "Planning output only; validate exact artifacts, licenses, runtime support, quality, memory, latency, and safety in the implementation project."
    },
    "configuration": {
        "device_profile": "laptop",
        "memory_envelope_mib": 1536,
        "topology": "independent",
        "orchestration": "router",
        "policy": "balanced",
        "model_scale_parameters_millions": 16,
        "precision_factor": "q4",
        "model_slots_installed": 6,
        "model_slots_active_max": 2,
        "model_slots_parallel_max": 1,
        "context_tokens": 2048
    },
    "topology_counts": {
        "base_models": 0,
        "standalone_specialists": 6,
        "adapters": 0,
        "active_standalone_max": 2,
        "active_adapters_max": 0,
        "parallel_standalone_max": 1,
        "parallel_adapters_max": 0
    },
    "estimates": {
        "bytes_per_parameter": 0.6,
        "module_weight_mib": 9.16,
        "adapter_weight_mib": 1.5,
        "total_cached_weights_mib": 59.33,
        "active_weights_mib": 18.31,
        "concurrent_compute_weights_mib": 9.16,
        "workspace_mib": 48,
        "context_state_allowance_mib": 16,
        "orchestration_allowance_mib": 32,
        "peak_runtime_mib": 114.31,
        "budget_utilization_percent": 7.4,
        "fit_status": "within",
        "estimate_method": "Transparent planning factors; replace with measured artifact, workspace, and KV-cache values before implementation claims."
    },
    "teleodynamic_policy": {
        "weights": {
            "benefit": 1,
            "memory": 1,
            "latency": 1,
            "uncertainty": 1
        },
        "local_score": "V(a) = expected_task_benefit - memory_cost - latency_cost - uncertainty_cost",
        "operators": [
            "activate",
            "deactivate",
            "replace",
            "retire",
            "reserve",
            "noop"
        ],
        "activation_rule": "Activate a role only when it is compatible, affordable, and has positive estimated local value relative to noop.",
        "no_op_rule": "Select noop when no candidate activation or structural edit has sufficient estimated benefit inside the declared envelope."
    },
    "skills": [
        {
            "slot": 1,
            "role_id": "intent-router",
            "label": "Intent and domain router",
            "phase": "Route",
            "description": "Classifies the request and proposes the smallest relevant active set.",
            "status": "conceptual-role"
        },
        {
            "slot": 2,
            "role_id": "retrieval",
            "label": "Retrieval and embedding specialist",
            "phase": "Ground",
            "description": "Finds local evidence, ranks passages, or produces task-specific embeddings.",
            "status": "conceptual-role"
        },
        {
            "slot": 3,
            "role_id": "summarization",
            "label": "Summarization specialist",
            "phase": "Transform",
            "description": "Compresses a bounded source into a declared output form.",
            "status": "conceptual-role"
        },
        {
            "slot": 4,
            "role_id": "verification",
            "label": "Verifier and contradiction checker",
            "phase": "Verify",
            "description": "Checks evidence alignment, contradictions, schemas, or task-specific acceptance rules.",
            "status": "conceptual-role"
        },
        {
            "slot": 5,
            "role_id": "memory-compression",
            "label": "Memory compression specialist",
            "phase": "Maintain",
            "description": "Produces bounded summaries or retrieval keys for local conversational or task memory.",
            "status": "conceptual-role"
        },
        {
            "slot": 6,
            "role_id": "formatting",
            "label": "Structured-output specialist",
            "phase": "Emit",
            "description": "Maps accepted content into a narrow schema, syntax, or presentation contract.",
            "status": "conceptual-role"
        }
    ],
    "warnings": []
}
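
A consumer of this handoff might run a few structural checks before trusting it. The sketch below is one possible shape, not a published validator: the function name `check_handoff` is hypothetical, the excerpt is abbreviated from the plan above, and the rule that skill slots are numbered 1 through the installed count is inferred from this one example.

```python
import json

# Abbreviated excerpt of the handoff above; only the checked fields are kept.
EXCERPT = """
{
  "boundary": {"executes_models": false, "downloads_model_weights": false},
  "configuration": {
    "model_slots_installed": 6,
    "model_slots_active_max": 2,
    "model_slots_parallel_max": 1
  },
  "skills": [{"slot": 1}, {"slot": 2}, {"slot": 3},
             {"slot": 4}, {"slot": 5}, {"slot": 6}]
}
"""


def check_handoff(plan: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    issues = []
    boundary = plan["boundary"]
    if boundary["executes_models"] or boundary["downloads_model_weights"]:
        issues.append("planning output must not execute or download models")

    cfg = plan["configuration"]
    installed = cfg["model_slots_installed"]
    active = cfg["model_slots_active_max"]
    parallel = cfg["model_slots_parallel_max"]
    if not parallel <= active <= installed:
        issues.append("expected parallel <= active <= installed")

    # Assumption: one skill per installed slot, numbered from 1.
    slots = [skill["slot"] for skill in plan["skills"]]
    if slots != list(range(1, installed + 1)):
        issues.append("skill slots do not match the installed slot count")
    return issues


problems = check_handoff(json.loads(EXCERPT))
print(problems or "no structural problems found")
```

Passing these checks says nothing about artifact compatibility, licenses, or measured memory; those remain the implementation project's responsibility.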

Three composition families

Separate model identity from the way capabilities are combined

Independent specialists

Each role has a separate artifact and may use a distinct architecture or tokenizer. Routing is explicit, but installed storage grows roughly with the number of specialists.

Shared base plus adapters

A common base remains resident while small, compatible skill adapters alter bounded behavior. Storage can be lower, but compatibility and interference become primary evidence requirements.

Hybrid mesh

Adapter-compatible language skills share a base while structurally different roles—such as retrieval, ranking, or verification—remain independent specialists.
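
The storage trade-off between the three families can be worked through with the composer's weight and adapter planning factors. This is a rough sketch: the function names are hypothetical, the hybrid split of three adapter roles and three standalone roles is an arbitrary example, and the totals are raw weight sums that may differ from the composer's installed-cache figure.

```python
MIB = 1024 * 1024


def weight_mib(params_millions: float, bytes_per_param: float) -> float:
    """Weight estimate = parameters x bytes-per-parameter planning factor."""
    return params_millions * 1e6 * bytes_per_param / MIB


def adapter_mib(base_mib: float) -> float:
    """Adapter estimate = the greater of 1.5 MiB or 4% of the base weights."""
    return max(1.5, 0.04 * base_mib)


def installed_weights(roles: int, adapter_roles: int,
                      params_millions: float, bytes_per_param: float) -> float:
    """Raw installed weights when `adapter_roles` of `roles` are adapters on
    one shared base and the remaining roles are standalone models."""
    module = weight_mib(params_millions, bytes_per_param)
    standalone = roles - adapter_roles
    base = module if adapter_roles > 0 else 0.0
    return base + adapter_roles * adapter_mib(module) + standalone * module


independent = installed_weights(6, 0, 16, 0.6)   # six standalone specialists
shared = installed_weights(6, 6, 16, 0.6)        # one base plus six adapters
hybrid = installed_weights(6, 3, 16, 0.6)        # base, 3 adapters, 3 experts

print(round(independent, 2), round(shared, 2), round(hybrid, 2))
```

At this scale the 1.5 MiB adapter floor dominates the 4% rule, so the shared-base estimate is driven almost entirely by the adapter count rather than by the base size.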

Teleodynamic control layer

Every structural action must justify its cost

The composer expresses a local planning score rather than claiming autonomous self-maintenance: V(a) = expected task benefit − memory cost − latency cost − uncertainty cost. A candidate activation is accepted only when it is compatible, affordable, and more useful than no-op under the declared policy.

Sense the task and current envelope
Record the request class, available memory, active modules, latency pressure, uncertainty, and policy constraints.
Propose the smallest useful active set
Use a router, explicit pipeline, cascade, or committee policy to identify candidate roles.
Evaluate structural actions
Compare activate, deactivate, replace, retire, reserve, and no-op using decomposed costs and expected benefit.
Execute only in the implementation project
The GGUF implementation project must verify exact artifacts, formats, licenses, compatibility, memory, latency, quality, rollback, and safety before running the plan.

Handoff boundary

The plan describes a target architecture; it is not a runnable bundle

The JSON output contains conceptual roles, transparent estimates, topology counts, resource policy, and warnings. It contains no model weights, source code, secrets, executable WebAssembly, or authority to publish a compatibility claim. Its three explicit limits are installed slots, resident active slots, and simultaneous parallel slots.
