Document Type: Technical Architecture & Research Analysis
Status: Working Draft — Contains Open Questions
Date: February 2026
ODIE (Outcome Driven Intelligence Engine) is a continuous reasoning engine that anchors all intelligence, learning, and action around explicitly defined outcomes. Its core reasoning loop — Sense → Contextualize → Reason → Decide → Act → Observe → Adapt — operates against a set of data objects: Outcomes, Signals, Beliefs, Constraints, and Action Hypotheses.
A critical architectural challenge has emerged: how does ODIE evaluate decisions against multiple desired outcomes simultaneously, and how does the result of that collective evaluation inform memory retrieval from the knowledge substrate?
This challenge has three dimensions:
The Intent Problem: What appears to be "intent" — the system's current operational posture — is not a discrete object but an emergent property of collectively weighted outcomes. How is this emergence formalized?
The Pooling Problem: Real-world agents have dozens to hundreds of desired outcomes. Evaluating every action hypothesis against every outcome for every decision cycle is computationally expensive and cognitively unrealistic. How are outcomes scoped to a contextually relevant subset?
The Memory Alignment Problem: Memory retrieval must be informed not just by topical relevance but by the currently active outcome context. How does the outcome evaluation layer communicate with the memory substrate to produce retrieval that reflects the system's composite priorities?
This document examines the science underlying these problems, proposes architectural approaches, defines a phased MVP implementation, and catalogs open questions requiring further research and decision-making.
In conventional AI architectures, intent is often modeled as a discrete state variable — a classifier output, a slot in a dialogue frame, or a node in a planning graph. This treatment is useful for narrow applications (voice assistants classifying user intent, for example) but breaks down in systems where multiple competing objectives must be balanced continuously.
In the ODIE framework, which is grounded in Jobs-to-Be-Done (JTBD), Outcome-Driven Innovation (ODI), and the Customer-Driven Mission Achievement Process (CD-MAP), intent is defined differently:
Intent is the composite vector that emerges from evaluating a decision context against the full, weighted collection of desired outcomes.
This means intent is not upstream of outcomes — it is a consequence of them. When a system holds multiple desired outcomes, each weighted by importance and satisfaction, and evaluates a potential action against that collection, the resulting priority distribution is the system's intent for that decision moment.
This distinction matters because it eliminates the need for a separate "Intent" data object in the ODIE architecture and instead demands a process — outcome pooling — that correctly assembles and weights the relevant outcomes for each decision cycle.
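In CD-MAP, intent is formally modeled as a collection of desired outcomes weighted by two dimensions: importance (I) and satisfaction (S).
The Opportunity Score, calculated as (2 × I) − S or a similar weighted formulation, identifies where the system should focus attention. The expression is equivalent to I + (I − S): importance plus the unmet gap between importance and satisfaction. (The standard ODI formulation floors that gap at zero, giving I + max(I − S, 0).) High importance combined with low satisfaction produces high opportunity, which produces urgency.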
The composite of all opportunity scores across the active outcome set constitutes the system's effective intent. This is not a single value but a distribution — a priority landscape that shapes which actions are favored, which memories are relevant, and which signals warrant attention.
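The opportunity score and the resulting priority landscape can be illustrated with a minimal sketch. It assumes importance and satisfaction share a 0-10 scale, uses the floored variant I + max(I − S, 0) (which coincides with (2 × I) − S whenever importance is at least satisfaction), and normalizes scores so they sum to 1. The `Outcome` class, function names, and normalization step are illustrative assumptions, not part of the ODIE specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    name: str
    importance: float    # assumed 0-10 scale
    satisfaction: float  # assumed 0-10 scale, same scale as importance


def opportunity(o: Outcome) -> float:
    """Importance plus the unmet gap, floored at zero.

    Equals (2 * I) - S whenever importance >= satisfaction.
    """
    return o.importance + max(o.importance - o.satisfaction, 0.0)


def intent_distribution(pool: list[Outcome]) -> dict[str, float]:
    """Normalize opportunity scores into a priority distribution (sums to 1)."""
    scores = {o.name: opportunity(o) for o in pool}
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: s / total for name, s in scores.items()}


pool = [
    Outcome("minimize customer response time", importance=9, satisfaction=3),
    Outcome("minimize cost per ticket", importance=6, satisfaction=6),
]
intent = intent_distribution(pool)
```

Here the under-served outcome (importance 9, satisfaction 3) takes the larger share of the distribution, which is the sense in which intent is a landscape rather than a single value.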
Three concepts require precise differentiation:
| Concept | Definition | Persistence | What Changes It |
|---|---|---|---|
| Outcome | A measurable desired future state (Direction + Measure + Object + Context) | Long-lived (years) | Mission redefinition |
| Motivation | The energy and priority weighting behind pursuing an outcome | Medium-lived (shifts with context) | Circumstantial change, new signals, satisfaction shifts |
| Intent | The emergent composite of all weighted outcomes for a decision context | Ephemeral (per decision cycle) | Any change to any participating outcome's weight |
Critically, motivation can shift without changing either the outcome or the chosen action. A system may still pursue "minimize customer response time" but with reduced urgency because satisfaction has improved. This shift in motivation doesn't require a new object — it's captured by the satisfaction dimension already present on the outcome.
Intent is even more transient. It exists only in the moment of evaluation and is recomputed for each decision cycle based on the currently active pool of outcomes and their current weights.
A realistic agent may manage 50–500 desired outcomes across multiple domains, hierarchies, and time horizons. Each decision cycle could involve evaluating multiple action hypotheses. Naively scoring every action against every outcome requires A × O evaluations per cycle, where A is the number of candidate actions and O is the total outcome count. With 10 candidate actions and 500 outcomes, for example, that is 5,000 evaluations per cycle.
This scaling issue is compounded by:
Human decision-making solves this problem through bounded rationality — the brain evaluates a contextually relevant subset of goals using heuristics, pattern matching, and emotional salience, trading completeness for speed.
This approach evolved for survival in physical environments where decisions had to be made quickly with incomplete information. The consequence is that human decision-making is fast but systematically biased:
These heuristics are the root of cognitive bias, prejudice, and systematic decision errors. They make humans fast reactors but imperfect reasoners.
An artificial cognitive system is not bound by these limitations. It does not have the brain's physical capacity constraints, does not require sub-second reaction times for most decisions, and can maintain awareness of outcomes that a human would neglect. This does not mean it should evaluate every outcome for every decision — that would be wasteful — but it means the scoping mechanism should be designed for accuracy and completeness, not for speed alone.
The system should be capable of varying how deeply it evaluates a decision based on the stakes and context:
| Decision Type | Example | Evaluation Approach |
|---|---|---|
| Trivial | Formatting a response | Minimal: check active constraints only |
| Routine | Selecting a standard workflow | Moderate: evaluate against directly-served outcome + parent + siblings |
| Consequential | Recommending a strategic action | Deep: evaluate against full active pool with constraint checking |
| Critical | Action with irreversible consequences | Exhaustive: evaluate against all outcomes, including dormant ones that might be affected |
| Emergency | Time-critical response required | Compressed: evaluate against top-N outcomes by opportunity score, skip low-priority |
This adaptive depth is a key architectural advantage over human cognition. The system allocates reasoning resources proportional to decision significance, rather than being constrained to a fixed cognitive bandwidth.
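One way to read the table above is as a dispatch from decision type to evaluation pool. The sketch below is a hedged illustration under simple assumptions: outcomes are identified by name, the caller already knows the parent and siblings of the directly served outcome, and `top_n` is a hypothetical parameter for the emergency case. None of these names come from the ODIE specification.

```python
from enum import Enum


class DecisionType(Enum):
    TRIVIAL = "trivial"
    ROUTINE = "routine"
    CONSEQUENTIAL = "consequential"
    CRITICAL = "critical"
    EMERGENCY = "emergency"


def evaluation_pool(
    decision: DecisionType,
    direct: str,
    family: list[str],         # parent + siblings of the directly served outcome
    active_pool: list[str],
    all_outcomes: list[str],   # includes dormant outcomes
    opportunity: dict[str, float],
    top_n: int = 3,
) -> list[str]:
    """Pick the outcomes to score against, by decision type."""
    if decision is DecisionType.TRIVIAL:
        return []  # constraints only; no outcome scoring
    if decision is DecisionType.ROUTINE:
        return [direct, *family]
    if decision is DecisionType.CONSEQUENTIAL:
        return list(active_pool)
    if decision is DecisionType.CRITICAL:
        return list(all_outcomes)
    # EMERGENCY: keep only the top-N active outcomes by opportunity score
    ranked = sorted(active_pool, key=lambda name: opportunity[name], reverse=True)
    return ranked[:top_n]


opportunity_scores = {"a": 15.0, "b": 6.0, "c": 12.0, "d": 3.0}
emergency = evaluation_pool(
    DecisionType.EMERGENCY, "a", ["p"], ["a", "b", "c"],
    ["a", "b", "c", "d"], opportunity_scores, top_n=2,
)
```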
The following approaches are not mutually exclusive. The recommended architecture layers multiple mechanisms, with different approaches serving as primary or secondary filters depending on the decision type and evaluation depth.
Mechanism: Outcomes are organized in parent/child hierarchies. When an action hypothesis targets a specific outcome, the evaluation pool is expanded to include the parent outcome and its siblings (outcomes at the same level under the same parent). This captures related concerns without traversing the entire tree.
Example: An action targets "minimize time to resolve support ticket." The parent outcome is "maximize customer satisfaction with support experience." Siblings include "minimize effort required from customer" and "maximize accuracy of first response." All participate in evaluation.
Strengths: Natural, intuitive scoping. Leverages existing hierarchy. Computationally cheap.
Weaknesses: Assumes the hierarchy is well-structured. Misses cross-branch interactions (an action that serves a customer outcome might impact a financial outcome in a different branch). Requires maintenance of the hierarchy as outcomes evolve.
Applicability: Strong as a primary mechanism for routine decisions. Insufficient alone for consequential or critical decisions.
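A minimal sketch of parent-plus-siblings scoping, assuming the hierarchy is stored as a child-to-parent mapping. The function name and the `parent_of` structure are illustrative. The first three entries reuse the support-ticket example; the "cost per ticket" branch is a hypothetical addition that shows what hierarchical scoping leaves out.

```python
def hierarchical_pool(target: str, parent_of: dict[str, str]) -> set[str]:
    """Return the target outcome, its parent, and its siblings."""
    pool = {target}
    parent = parent_of.get(target)
    if parent is None:
        return pool  # root outcome: no parent or siblings
    pool.add(parent)
    pool.update(child for child, p in parent_of.items() if p == parent)
    return pool


SUPPORT = "maximize customer satisfaction with support experience"
parent_of = {
    "minimize time to resolve support ticket": SUPPORT,
    "minimize effort required from customer": SUPPORT,
    "maximize accuracy of first response": SUPPORT,
    # hypothetical outcome in a different branch
    "minimize cost per ticket": "maximize support margin",
}
pool = hierarchical_pool("minimize time to resolve support ticket", parent_of)
```

The cross-branch outcome never enters the pool, which is exactly the weakness noted above.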
Mechanism: Signals are already linked to the outcomes they affect. When a signal (or cluster of signals) fires, the outcomes linked to those signals are activated and pulled into the evaluation pool. This creates a dynamic, context-responsive scope — the system evaluates against whatever outcomes are currently "energized" by incoming signals.
Example: Signals indicating thieves' guild activity activate outcomes related to safety and reputation. These outcomes join the evaluation pool alongside the directly-served outcome (acquire gold), causing the system to evaluate the haggling action against the full relevant context.
Strengths: Dynamic and context-responsive. Naturally captures cross-domain interactions. No manual scoping required.
Weaknesses: Depends on signal quality and signal-to-outcome linkage completeness. May over-activate if signals are noisy. May under-activate if important outcomes have no current signals.
Applicability: Strong as a complementary mechanism. Especially valuable for detecting emerging relevance that hierarchical scoping would miss.
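Signal-activated pooling could be sketched as accumulating activation along signal-to-outcome links. The link strengths, the activation threshold (a simple guard against noisy over-activation), and every name below are assumptions made for illustration. The signals and outcomes loosely follow the guild example.

```python
from collections import defaultdict


def signal_activated_pool(
    firing: set[str],
    links: dict[str, dict[str, float]],  # signal -> {outcome: link strength}
    direct: set[str],
    threshold: float = 0.5,
) -> dict[str, float]:
    """Sum link strengths from firing signals; keep outcomes at or above threshold."""
    activation: dict[str, float] = defaultdict(float)
    for signal in firing:
        for outcome, strength in links.get(signal, {}).items():
            activation[outcome] += strength
    pool = {o: a for o, a in activation.items() if a >= threshold}
    for outcome in direct:
        pool.setdefault(outcome, 1.0)  # directly served outcomes always participate
    return pool


links = {
    "guild activity reported": {"stay safe": 0.8, "protect reputation": 0.6},
    "rain forecast": {"stay dry": 0.2},  # weak link: stays below threshold
}
pool = signal_activated_pool(
    firing={"guild activity reported", "rain forecast"},
    links=links,
    direct={"acquire gold"},
)
```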
Mechanism: Before outcome scoring begins, active constraints are checked against the action hypothesis. If the action violates a hard constraint, it is eliminated immediately. If it tensions a soft constraint, the constraint's associated outcomes are activated and added to the evaluation pool.
Example: "Do not violate data privacy regulations" is a hard constraint. Any action hypothesis that involves sharing customer data is eliminated without outcome scoring. "Minimize disruption to existing workflows" is a soft constraint — actions that tension it trigger the inclusion of workflow-related outcomes in the evaluation pool.
Strengths: Computationally efficient (eliminates actions early). Prevents constraint violations from being "outscored" by high-opportunity outcomes.
Weaknesses: Requires clean constraint definitions. Hard/soft boundary may be context-dependent.
Applicability: Should always be the first filter applied, regardless of decision depth.
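The hard/soft distinction can be sketched as a pre-filter that runs before any scoring. In this hedged example, constraints carry a predicate over an action hypothesis (represented as a plain dict); the `Constraint` class, the dict keys, and the outcome name attached to the soft constraint are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Constraint:
    name: str
    hard: bool
    violated_by: Callable[[dict], bool]  # predicate over an action hypothesis
    outcomes: list[str] = field(default_factory=list)  # pooled when a soft constraint is tensioned


def prefilter(actions: list[dict], constraints: list[Constraint]):
    """Drop actions that violate a hard constraint; collect outcomes activated by soft ones."""
    feasible, activated = [], set()
    for action in actions:
        hits = [c for c in constraints if c.violated_by(action)]
        if any(c.hard for c in hits):
            continue  # eliminated without outcome scoring
        feasible.append(action)
        for c in hits:
            activated.update(c.outcomes)
    return feasible, activated


constraints = [
    Constraint("do not violate data privacy regulations", hard=True,
               violated_by=lambda a: a.get("shares_customer_data", False)),
    Constraint("minimize disruption to existing workflows", hard=False,
               violated_by=lambda a: a.get("disrupts_workflow", False),
               outcomes=["minimize workflow disruption"]),
]
actions = [
    {"name": "export customer list to partner", "shares_customer_data": True},
    {"name": "switch ticketing tool", "disrupts_workflow": True},
    {"name": "add canned replies"},
]
feasible, activated = prefilter(actions, constraints)
```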
Mechanism: Outcomes, beliefs, signals, and constraints are stored in a graph database with weighted edges representing relationships (reinforcement, conflict, dependency, causation). For a given action hypothesis and its directly-served outcome, the evaluation pool is defined as all outcomes within N hops in the graph, with influence decaying by distance.
Example: "Acquire gold" is 1 hop from "maintain livelihood" (parent), 2 hops from "reputation in village" (sibling of parent), and 3 hops from "guild relationship" (connected through reputation). At evaluation depth N=3, all participate. At N=1, only "maintain livelihood" does.
Strengths: Captures emergent relationships that hierarchy alone misses. Distance-weighted influence is natural and tunable. Supports the adaptive evaluation depth model (N varies by decision type).
Weaknesses: Requires a well-maintained graph. Graph construction and maintenance add complexity. Hop-based proximity may not capture all relevant relationships (some distant outcomes may be highly relevant due to strong but indirect connections).
Applicability: Strong as the primary mechanism for consequential and critical decisions. The hop depth N becomes the tunable parameter for adaptive evaluation depth.
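A hop-limited breadth-first traversal gives a minimal sketch of graph proximity pooling. It assumes an undirected adjacency-list graph and a geometric decay per hop; the `decay` value and function name are illustrative, and edge weights (which the full design calls for) are deliberately ignored to keep the example short.

```python
from collections import deque


def proximity_pool(
    graph: dict[str, list[str]],
    start: str,
    max_hops: int,
    decay: float = 0.5,
) -> dict[str, float]:
    """Breadth-first search out to max_hops; influence = decay ** hops."""
    influence = {start: 1.0}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the evaluation depth
        for neighbor in graph.get(node, ()):
            if neighbor not in influence:
                influence[neighbor] = decay ** (hops + 1)
                queue.append((neighbor, hops + 1))
    return influence


graph = {
    "acquire gold": ["maintain livelihood"],
    "maintain livelihood": ["acquire gold", "reputation in village"],
    "reputation in village": ["maintain livelihood", "guild relationship"],
    "guild relationship": ["reputation in village"],
}
deep = proximity_pool(graph, "acquire gold", max_hops=3)
shallow = proximity_pool(graph, "acquire gold", max_hops=1)
```

With `max_hops=3` all four outcomes participate, with influence halving at each hop; with `max_hops=1` only the directly served outcome and "maintain livelihood" remain.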
Mechanism: Inspired by transformer attention mechanisms. Each outcome has an embedding. The current decision context (action hypothesis + active signals + current state) also has an embedding. Attention scores between the context and all outcomes determine which outcomes participate and with what weight.
Example: The decision context embedding (action: haggle, signals: guild activity, state: shop) produces high attention scores against safety outcomes, moderate scores against financial outcomes, and low scores against relationship outcomes. The evaluation pool is defined by a threshold on attention score.
Strengths: Learned rather than engineered. Can capture subtle, non-obvious relevance patterns. Naturally produces continuous weights rather than binary include/exclude.
Weaknesses: Requires training data. Introduces opacity (why did this outcome score high?). May overfit to observed patterns and miss novel situations. Conflicts with ODIE's principle of explainability.
Applicability: Potential future enhancement. Not appropriate for MVP. Could complement graph proximity as a secondary signal in later phases.
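For orientation only, the attention idea can be sketched with scaled dot-product scores and a softmax. The two-dimensional embeddings below are hand-made toy values, not learned, and the threshold is arbitrary; a real implementation would need trained embeddings, which is part of why this approach is deferred.

```python
import math


def attention_weights(
    context: list[float],
    outcome_embeddings: dict[str, list[float]],
) -> dict[str, float]:
    """Scaled dot-product score per outcome, normalized with a softmax."""
    scale = math.sqrt(len(context))
    scores = {
        name: sum(c * e for c, e in zip(context, emb)) / scale
        for name, emb in outcome_embeddings.items()
    }
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - peak) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: v / total for name, v in exps.items()}


def attention_pool(context, outcome_embeddings, threshold):
    """Keep outcomes whose attention weight meets the threshold."""
    weights = attention_weights(context, outcome_embeddings)
    return {name: w for name, w in weights.items() if w >= threshold}


# Toy embeddings (assumed values, for illustration only)
embeddings = {
    "safety": [1.0, 0.0],
    "financial": [0.5, 0.5],
    "relationship": [-0.5, 1.0],
}
context = [1.0, 0.2]  # stands in for action + signals + state
weights = attention_weights(context, embeddings)
pool = attention_pool(context, embeddings, threshold=0.25)
```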
The MVP for ODIE is envisioned as a signal and feedback loop with basic outcome scoring, progressively enhanced with outcome pooling and then full belief revision. This section examines established models for handling multiple objectives in a continuous decision loop, evaluating their applicability to ODIE.
Origin: Decision theory / operations research.
Mechanism: Each action is scored by computing a weighted sum of its utility across multiple attributes (in ODIE's case, desired outcomes). The utility function U(a) for action a is:
U(a) = Σ wᵢ × uᵢ(a)
Where wᵢ is the weight (derived from importance and satisfaction) of outcome i, and uᵢ(a) is the utility of action a with respect to outcome i (the expected outcome delta).
Fit for ODIE: High. MAUT directly maps to ODIE's existing objects. Outcome importance/satisfaction produces weights. Action hypothesis expected_outcome_delta produces utilities. The composite score is the "intent vector" for that decision. MAUT is the most natural formalization of CD-MAP's weighted outcome model.
Limitations: Assumes outcome independence (the value of one outcome doesn't depend on the level of another). Requires cardinal utility measurement. Sensitive to weight calibration.
Recommendation: Primary model for MVP. Simple, interpretable, directly maps to ODIE data objects.
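The weighted sum U(a) = Σ wᵢ × uᵢ(a) is small enough to sketch directly. The weights and expected deltas below are made-up illustrative numbers, and the function names are assumptions; outcomes an action does not mention are treated as contributing zero.

```python
def maut_score(weights: dict[str, float], deltas: dict[str, float]) -> float:
    """U(a) = sum_i w_i * u_i(a); outcomes missing from deltas contribute 0."""
    return sum(w * deltas.get(name, 0.0) for name, w in weights.items())


def rank_actions(
    weights: dict[str, float],
    actions: dict[str, dict[str, float]],
) -> list[tuple[str, float]]:
    """Return (action, score) pairs sorted best-first."""
    scored = [(name, maut_score(weights, deltas)) for name, deltas in actions.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


weights = {"response time": 0.6, "cost": 0.4}
actions = {
    "hire agent": {"response time": 0.5, "cost": -0.3},
    "add chatbot": {"response time": 0.3, "cost": 0.1},
}
ranking = rank_actions(weights, actions)
```

In this example "hire agent" has the larger effect on response time, but its negative cost delta pulls its composite score below "add chatbot". Because each term is visible, the ranking can be explained outcome by outcome.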
Origin: Evolutionary computation / engineering optimization.
Mechanism: Rather than collapsing multiple objectives into a single score, Pareto optimization identifies the set of non-dominated actions: those for which no other action is at least as good on every objective and strictly better on at least one. The result is the Pareto frontier.
Fit for ODIE: Moderate. Useful when outcomes genuinely conflict and no single weighting is appropriate. For example, if "minimize cost" and "maximize quality" are both desired, Pareto analysis reveals the trade-off space rather than forcing a single answer.
Limitations: Produces a set of solutions, not a single recommendation. Requires a secondary mechanism (user preference, ODIE's belief system, or opportunity scoring) to select from the frontier. More complex to implement and explain.
Recommendation: Phase 3+ enhancement. Valuable for surfacing trade-offs in consequential decisions, but too complex for MVP. Could be used to present options to the user/orchestrator when the Pareto frontier reveals non-trivial trade-offs.
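A brute-force sketch of extracting the Pareto frontier, assuming each action is described by a tuple of objective values where higher is better. The action names and values are hypothetical; the pairwise check is quadratic in the number of actions, which is acceptable for small candidate sets.

```python
def dominates(a: tuple[float, ...], b: tuple[float, ...]) -> bool:
    """a dominates b if it is at least as good everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_frontier(actions: dict[str, tuple[float, ...]]) -> list[str]:
    """Return the names of actions not dominated by any other action."""
    return [
        name
        for name, vec in actions.items()
        if not any(
            dominates(other, vec)
            for other_name, other in actions.items()
            if other_name != name
        )
    ]


# Objectives: (cost saving, quality), both higher-is-better
candidates = {
    "cheapest vendor": (0.9, 0.2),
    "balanced vendor": (0.5, 0.6),
    "mediocre vendor": (0.4, 0.5),  # worse than "balanced vendor" on both objectives
    "premium vendor": (0.1, 0.9),
}
frontier = pareto_frontier(candidates)
```

Three candidates survive, and choosing among them still needs the secondary mechanism described under Limitations.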
Origin: Thomas Saaty (1980). Operations research / decision science.
Mechanism: AHP structures the decision as a hierarchy: goal → criteria (outcomes) → alternatives (actions). Pairwise comparisons between outcomes determine their relative weights. Pairwise comparisons between actions on each outcome determine action scores. The composite produces a final ranking.
Fit for ODIE: Moderate. AHP's hierarchical structure aligns with ODIE's outcome hierarchy. Pairwise comparisons could be derived from opportunity scores rather than requiring human input.
Limitations: Pairwise comparisons become expensive as outcome count grows (n(n-1)/2 comparisons for n outcomes). The consistency requirement (transitive preferences) may not hold for all outcome sets. Adds complexity without clear advantage over MAUT for ODIE's use case.
Recommendation: Not recommended for core implementation. AHP's pairwise comparison mechanism is better suited to human-driven prioritization exercises than to continuous automated decision loops.
Origin: AI planning / constraint programming.
Mechanism: The decision problem is modeled as a constraint satisfaction problem where constraints must be satisfied first, then an objective function is optimized over the feasible region. This directly maps to ODIE's separation of constraints (must satisfy) and outcomes (optimize).
Fit for ODIE: High for the constraint pre-filtering layer. Constraints define the feasible action space; MAUT or another scoring model evaluates within that space.
Limitations: Pure CSP doesn't handle soft constraints or constraint prioritization well. Needs to be hybridized with an optimization approach.
Recommendation: Integrate into MVP as the constraint pre-filtering layer. Hard constraints eliminate actions. Soft constraints adjust weights. MAUT scores within the feasible space.
Origin: Probabilistic AI / decision theory.
Mechanism: A graphical model combining a Bayesian network (representing beliefs about the world) with a utility function (representing outcome preferences). Decisions are evaluated by computing expected utility under uncertainty — each action's utility is weighted by the probability of its consequences given current beliefs.
Fit for ODIE: High for the belief revision and reasoning layers. ODIE's beliefs (provisional models with confidence scores) and signals (observable evidence) map directly to Bayesian network nodes. The belief revision mechanism (update confidence based on supporting/contradicting evidence) maps naturally onto Bayesian updating.
Limitations: Full Bayesian inference is computationally expensive for large networks. Requires probability distributions that may be difficult to calibrate initially. Network structure must be defined or learned.
Recommendation: Phase 2-3 implementation for belief revision. The MVP can use simplified confidence scoring. Full Bayesian networks become valuable when the belief revision mechanism matures and when signal reliability needs formal treatment.
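As a bridge between simplified confidence scoring and full Bayesian networks, a single-belief update with Bayes' rule can be sketched as follows. It treats belief confidence as a probability and assumes the two likelihoods (how probable the signal is if the belief is true versus false) are available; the function name and the numbers are illustrative.

```python
def update_confidence(
    prior: float,
    p_signal_if_true: float,
    p_signal_if_false: float,
) -> float:
    """Posterior P(belief | signal) via Bayes' rule."""
    numerator = p_signal_if_true * prior
    evidence = numerator + p_signal_if_false * (1.0 - prior)
    if evidence == 0:
        return prior  # signal impossible under both hypotheses; leave belief unchanged
    return numerator / evidence


# A supporting signal raises confidence from 0.5 to 0.8 ...
after_support = update_confidence(0.5, p_signal_if_true=0.8, p_signal_if_false=0.2)
# ... and an equally strong contradicting signal brings it back to 0.5.
after_contradiction = update_confidence(after_support, p_signal_if_true=0.2, p_signal_if_false=0.8)
```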
Origin: Multi-objective reinforcement learning (MORL).
Mechanism: The agent receives multiple reward signals (one per outcome) and must learn a policy that balances them. Approaches include scalarization (weighted sum, equivalent to MAUT), Pareto-based methods, and constrained RL (optimize one objective subject to constraints on others).
Limitations for ODIE: ODIE explicitly distinguishes itself from RL. ODIE uses belief revision, not reward maximization. Outcomes are stable and explicitly defined, not learned from reward signals. The "training phase" concept conflicts with ODIE's continuous reasoning model.
Recommendation: Not adopted as a framework, but specific MORL techniques — particularly scalarization and constraint handling — inform the implementation of MAUT and CSP+O within ODIE's non-RL paradigm.
| Model | MVP Fit | Outcome Pooling | Explainability | Computational Cost | Recommendation |
|---|---|---|---|---|---|
| MAUT | ★★★★★ | Via weights | High | Low | MVP primary |
| CSP+O | ★★★★☆ | Via constraints | High | Low-Medium | MVP constraint layer |
| Pareto | ★★☆☆☆ | Native | Medium | Medium | Phase 3+ |
| AHP | ★★☆☆☆ | Via hierarchy | High | High (pairwise) | Not recommended |
| Bayesian DN | ★★★☆☆ | Via network | Medium | High | Phase 2-3 belief revision |
| MORL | ★☆☆☆☆ | Via rewards | Low | High | Informational only |
Objective: Establish the foundational ODIE reasoning loop with the minimum viable set of objects and processes.
Core Objects:
Core Process:
SENSE → SCORE → FILTER → SELECT → EXECUTE → OBSERVE → UPDATE
Detailed Loop:
What Phase 1 Skips:
Objective: Introduce dynamic outcome pooling so the system evaluates decisions against a contextually relevant set of outcomes, and integrate this with the memory substrate for intent-aligned retrieval.
New Capabilities:
Enhanced Process:
SENSE → POOL → SCORE → FILTER → SELECT → EXECUTE → OBSERVE → UPDATE
The new POOL step:
Memory Integration Protocol:
When the knowledge substrate receives a retrieval request, it receives two inputs:
Retrieval scoring becomes:
relevance(memory) = α × semantic_similarity(memory, query)
+ β × outcome_alignment(memory, active_pool)
Where outcome_alignment measures how well the memory's historical context matches the currently active outcomes, and α/β are tunable parameters controlling the balance between topical relevance and intent alignment.
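One possible reading of outcome_alignment is the share of the active pool's total weight carried by outcomes the memory was tagged with. The sketch below uses that definition; it is an assumption, as are the default α and β values, the example pool, and the memory tags.

```python
def outcome_alignment(memory_tags: set[str], pool: dict[str, float]) -> float:
    """Share of the pool's total weight carried by outcomes the memory is tagged with."""
    total = sum(pool.values())
    if total == 0:
        return 0.0
    return sum(w for name, w in pool.items() if name in memory_tags) / total


def relevance(
    semantic_similarity: float,
    memory_tags: set[str],
    pool: dict[str, float],
    alpha: float = 0.6,
    beta: float = 0.4,
) -> float:
    """relevance = alpha * semantic_similarity + beta * outcome_alignment."""
    return alpha * semantic_similarity + beta * outcome_alignment(memory_tags, pool)


active_pool = {"stay safe": 0.5, "protect reputation": 0.3, "acquire gold": 0.2}
# Memory A: topically closer, but tagged only with the lowest-priority outcome
score_a = relevance(0.9, {"acquire gold"}, active_pool)
# Memory B: less similar, but tagged with the two highest-priority outcomes
score_b = relevance(0.7, {"stay safe", "protect reputation"}, active_pool)
```

Memory B outranks memory A despite lower semantic similarity, which is the intended effect of intent-aligned retrieval.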
Objective: Introduce full belief management, Bayesian confidence modeling, and adaptive evaluation depth.
New Capabilities:
Enhanced Process:
SENSE → POOL → CONTEXTUALIZE → REASON → DECIDE → EXECUTE → OBSERVE → ADAPT
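This corresponds to the full ODIE reasoning loop as specified in the core architecture document (Sense → Contextualize → Reason → Decide → Act → Observe → Adapt), with POOL made explicit as its own step and Act implemented as EXECUTE. Phases 1 and 2 are progressive approximations building toward this target.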
┌─────────────┬──────────────┬───────────────┬──────────────┬──────────────┐
│ ENVIRONMENT │ ODIE │ OUTCOME │ MEMORY │ ACTION │
│ (Signals) │ (Reasoning) │ POOL │ (Cogniscient│ (Fluxio/ │
│ │ │ │ Substrate) │ Archer) │
├─────────────┼──────────────┼───────────────┼──────────────┼──────────────┤
│ │ │ │ │ │
│ Signal(s) │ │ │ │ │
│ detected ──►│ SENSE │ │ │ │
│ │ Ingest & │ │ │ │
│ │ classify │ │ │ │
│ │ │ │ │ │ │
│ │ ▼ │ │ │ │
│ │ Identify │ │ │ │
│ │ directly │ │ │ │
│ │ served │ │ │ │
│ │ outcome(s) ─►│ POOL │ │ │
│ │ │ Expand via: │ │ │
│ │ │ • Hierarchy │ │ │
│ │ │ • Signal │ │ │
│ │ │ activation │ │ │
│ │ │ • Graph │ │ │
│ │ │ proximity │ │ │
│ │ │ • Adaptive │ │ │
│ │ │ depth │ │ │
│ │ │ │ │ │ │
│ │ │ ▼ │ │ │
│ │ │ Active pool ─►│ RETRIEVE │ │
│ │ │ + weights │ Weighted by: │ │
│ │ │ │ • Semantic │ │
│ │ │ │ similarity │ │
│ │ │ │ • Outcome │ │
│ │ │ │ alignment │ │
│ │ │ │ │ │ │
│ │ │ │ ▼ │ │
│ │ ◄──────────┼───────────────┼─ Relevant │ │
│ │ │ │ memories │ │
│ │ SCORE │ │ │ │
│ │ MAUT eval: │ │ │ │
│ │ U(a) = Σ wᵢ │ │ │ │
│ │ × uᵢ(a) │ │ │ │
│ │ │ │ │ │ │
│ │ ▼ │ │ │ │
│ │ FILTER │ │ │ │
│ │ Constraint │ │ │ │
│ │ check: │ │ │ │
│ │ • Hard: elim │ │ │ │
│ │ • Soft: add │ │ │ │
│ │ outcomes ─►│ Re-pool if │ │ │
│ │ │ constraints │ │ │
│ │ ◄──────────┤ activated │ │ │
│ │ │ │ new outcomes │ │ │
│ │ ▼ │ │ │ │
│ │ SELECT │ │ │ │
│ │ Rank actions │ │ │ │
│ │ by composite │ │ │ │
│ │ score ───────┼───────────────┼──────────────►│ EXECUTE │
│ │ │ │ │ Action via │
│ │ │ │ │ Fluxio or │
│ │ │ │ │ recommend │
│ │ │ │ │ via Archer │
│ │ │ │ │ │ │
│ ◄───────────┼──────────────┼───────────────┼──────────────┤ │ │
│ World state │ │ │ │ ▼ │
│ changes │ │ │ │ Results / │
│ │ │ │ │ │ consequences │
│ ▼ │ │ │ │ │
│ New │ │ │ │ │
│ signal(s) ─►│ OBSERVE │ │ │ │
│ │ Compare: │ │ │ │
│ │ expected Δ │ │ │ │
│ │ vs actual Δ │ │ │ │
│ │ │ │ │ │ │
│ │ ▼ │ │ │ │
│ │ UPDATE ─────►│ Adjust │ │ │
│ │ │ satisfaction │ │ │
│ │ │ scores on │ │ │
│ │ │ affected │ │ │
│ │ │ outcomes │ │ │
│ │ │ │ │ │
│ │ Log feedback │ │ Store │ │
│ │ ────────────►│───────────────►│ action/ │ │
│ │ │ │ outcome │ │
│ │ │ │ episode │ │
│ │ │ │ │ │
│ │ [Return to │ │ │ │
│ │ SENSE] │ │ │ │
└─────────────┴──────────────┴───────────────┴──────────────┴──────────────┘
Key Interactions:
Human bounded rationality exists because of physical constraints: limited working memory (7±2 items), attentional bottlenecks, metabolic costs of neural computation, and evolutionary pressure for rapid threat response. These constraints produced the heuristic-based decision-making that behavioral economics has extensively documented — useful for survival but systematically biased.
An artificial cognitive system operating within the ODIE framework is not subject to these constraints. It can:
The system should not simply evaluate everything exhaustively. That would be wasteful and would introduce latency into decisions that require rapid response. Instead, the architecture should support adaptive processing allocation — a meta-decision about how deeply to evaluate the decision at hand.
This meta-decision can be driven by:
This produces a system that is faster than humans on routine decisions (because it doesn't overthink them) and more thorough than humans on consequential decisions (because it doesn't under-think them). The allocation itself can be logged and improved over time through the Observe → Update feedback loop.
There are risks to building a system that evaluates decisions more thoroughly than humans do:
These risks are manageable but must be designed for, not assumed away.
The knowledge substrate (Cogniscient in the ARCHER architecture) stores memories — episodic records of past events, actions, outcomes, and learned associations. Memory retrieval is currently conceived as primarily semantic: retrieve memories that are topically relevant to the current context.
This is insufficient for intent-aligned retrieval. A memory can be topically relevant but intent-misaligned — it relates to the right domain but would bias the system toward an action that conflicts with currently prioritized outcomes.
When ODIE assembles an active outcome pool for a decision cycle, that pool (with weights) is passed to the knowledge substrate as retrieval context. The substrate then scores candidate memories on two dimensions:
A memory's outcome alignment can be determined by:
score(memory, query, pool) = α × semantic(memory, query)
+ β × alignment(memory, pool)
+ γ × recency(memory)
+ δ × reinforcement(memory)
Where:
α, β, γ, δ are tunable weights (β increases for consequential decisions)
semantic() measures topical relevance
alignment() measures overlap between memory's outcome tags and the active pool, weighted by pool priorities
recency() captures temporal decay
reinforcement() captures how many times the memory has proven useful (myelination analog)
After a decision cycle completes:
This is the mechanism that connects ODIE's decision loop to Cogniscient's learning substrate. Decisions produce consequences; consequences produce signals; signals update outcomes; outcomes inform future retrieval. The loop is continuous and self-reinforcing.
The following items require further research, experimentation, or architectural decision-making before implementation.
| Question | Considerations |
|---|---|
| What is the right default pool size? | Too small and the system misses relevant outcomes. Too large and evaluation is wasteful. Empirical testing with representative decision scenarios is needed to calibrate. Starting with hierarchical scoping (parent + siblings) and measuring outcome neglect rate would establish a baseline. |
| How are cross-branch interactions detected? | Hierarchical scoping misses outcomes in different branches that are nonetheless affected by an action. Signal-activated pooling partially addresses this, but only if signals are correctly linked. A cross-impact matrix (which outcomes affect which others) could supplement the graph but requires manual or learned construction. |
| Should dormant outcomes ever activate? | Some outcomes may have high importance but zero current signals. Should they participate in evaluation based on importance alone, or only when signaled? The risk of exclusion is goal neglect; the risk of inclusion is evaluation noise. A periodic "dormancy check" that reviews un-signaled high-importance outcomes could mitigate both. |
| How is the outcome graph maintained? | As outcomes are added, removed, or restructured, the graph topology changes. Who or what maintains the graph? Manual maintenance doesn't scale. Automated graph evolution based on observed signal correlations and action-outcome relationships is more sustainable but introduces the risk of spurious connections. |
| Question | Considerations |
|---|---|
| How are MAUT weights calibrated? | Importance and satisfaction are the primary inputs, but their scales must be commensurate. If importance is scored 1-100 and satisfaction is scored on a different scale, the opportunity score is distorted. A standardized scoring rubric is needed, potentially domain-specific. |
| How is the expected outcome delta estimated? | Action hypotheses carry an expected delta, but where does this estimate come from? Initial values may be human-provided, but the system should learn to estimate deltas based on historical action-outcome episodes. This is a regression problem that could be modeled with simple statistical methods initially and graduated to more sophisticated models. |
| What happens when outcomes conflict? | MAUT handles conflicts through weighting — the higher-weighted outcome wins. But some conflicts may require surfacing to a human rather than auto-resolving. A conflict detection mechanism that identifies when top-ranked actions are close in score but diverge on key outcomes would enable "present the trade-off" rather than "pick a winner." This connects to Phase 3 Pareto analysis. |
| How is the α/β balance in retrieval scoring determined? | The balance between semantic relevance and outcome alignment in memory retrieval is likely context-dependent. Routine decisions may lean on semantic similarity (α high); consequential decisions may lean on outcome alignment (β high). The adaptive evaluation depth mechanism could modulate this parameter. |
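
The conflict detection mechanism raised in the table above (top-ranked actions close in score but divergent on key outcomes) could be sketched as follows. The `score_margin` and `divergence` thresholds and the example data are hypothetical values that would need calibration.

```python
def detect_tradeoff(
    ranked: list[tuple[str, float]],       # (action, composite score), best-first
    deltas: dict[str, dict[str, float]],   # action -> {outcome: expected delta}
    score_margin: float = 0.05,
    divergence: float = 0.3,
) -> list[str]:
    """Return outcomes on which the top two actions diverge, if their scores are close."""
    if len(ranked) < 2:
        return []
    (best, best_score), (second, second_score) = ranked[0], ranked[1]
    if best_score - second_score > score_margin:
        return []  # clear winner: nothing to surface
    outcomes = set(deltas[best]) | set(deltas[second])
    return sorted(
        o for o in outcomes
        if abs(deltas[best].get(o, 0.0) - deltas[second].get(o, 0.0)) >= divergence
    )


ranked = [("add chatbot", 0.22), ("hire agent", 0.18)]
deltas = {
    "hire agent": {"response time": 0.5, "cost": -0.3},
    "add chatbot": {"response time": 0.3, "cost": 0.1},
}
contested = detect_tradeoff(ranked, deltas)
```

A non-empty result is a cue to present the trade-off to a human rather than auto-selecting the top-ranked action.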
| Question | Considerations |
|---|---|
| How is decision significance classified? | The trivial/routine/consequential/critical/emergency taxonomy is conceptually clear but operationally vague. What features of a decision determine its class? Candidate features include: number of outcomes affected, maximum possible outcome delta, constraint involvement, time pressure, and novelty. A decision classifier (potentially rule-based for MVP, learned for later phases) is needed. |
| What is the latency budget per decision class? | Trivial decisions should complete in milliseconds; critical decisions may warrant seconds or even minutes of evaluation. The architecture must support time-bounded evaluation with graceful degradation — if the time budget expires, return the best result found so far. |
| Can the system learn to calibrate its own depth? | Over time, the system will accumulate data on decision-depth vs. outcome-quality. Decisions evaluated at shallow depth that produced poor outcomes suggest the depth was insufficient. This meta-learning could be an advanced feature where the system improves its own evaluation depth allocation. |
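
Time-bounded evaluation with graceful degradation, as raised in the latency-budget question above, can be sketched as an anytime loop that returns the best result found when the budget expires. The function name, the injectable `clock` parameter (included so the behavior can be tested deterministically), and the choice to always score at least one action are assumptions of this sketch.

```python
import time
from typing import Callable, Iterable


def evaluate_with_budget(
    actions: Iterable,
    score_fn: Callable[[object], float],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
):
    """Score actions until the budget expires; return (best, best_score, evaluated)."""
    deadline = clock() + budget_seconds
    best, best_score, evaluated = None, float("-inf"), 0
    for action in actions:
        # Always score at least one action so a result is available.
        if evaluated > 0 and clock() >= deadline:
            break  # graceful degradation: stop with a partial result
        score = score_fn(action)
        evaluated += 1
        if score > best_score:
            best, best_score = action, score
    return best, best_score, evaluated


# With a generous budget every candidate is scored.
best, best_score, evaluated = evaluate_with_budget([1, 5, 3, 9], float, budget_seconds=5.0)
```

Ordering candidates so that the most promising are scored first (for example, by a cheap prior estimate) would make the partial result more useful when the budget is tight.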
| Question | Considerations |
|---|---|
| How are memories tagged with outcome context at formation? | When an episode is stored, the active outcome pool at that time should be recorded as metadata. This requires the memory substrate to accept outcome context from ODIE at write time, not just at read time. The interface between ODIE and the knowledge substrate must be bidirectional. |
| What is the decay model for outcome tags on memories? | If a memory was formed when "minimize support response time" was the active outcome, but that outcome has since been achieved and deprioritized, should the tag remain? Outcome tags could inherit the current weight of their associated outcome, naturally deprioritizing memories associated with satisfied outcomes. |
| How does reinforcement interact with belief revision? | In Phase 3, beliefs are revised based on evidence. If a belief revision changes the confidence in an outcome's importance, should that retroactively adjust the reinforcement scores of memories associated with that outcome? This creates a complex feedback dynamic that needs careful design to avoid instability. |
| Question | Considerations |
|---|---|
| How are constraints distinguished from high-importance outcomes? | "Avoid violating data privacy regulations" could be a constraint (hard boundary) or a high-importance outcome (weighted evaluation). The distinction matters: constraints eliminate actions; outcomes score them. The architecture should provide clear criteria for when something is a constraint vs. an outcome. Proposed heuristic: if violation is unacceptable regardless of other outcome benefits, it's a constraint. If violation could theoretically be justified by sufficient benefit elsewhere, it's an outcome. |
| Can constraints be context-dependent? | A soft constraint in one context may be hard in another. "Minimize disruption to workflows" might be soft during normal operations but hard during a critical production period. Constraints may need contextual activation rules, similar to outcomes having signal-activated relevance. |
| How do constraints propagate through the outcome graph? | If a constraint is activated, it may imply the activation of related outcomes. The mechanism for this propagation needs definition — does constraint activation use the same graph proximity mechanism as outcome pooling, or a separate pathway? |
| Question | Considerations |
|---|---|
| Where does outcome pooling run? | Outcome pooling involves graph traversal, signal matching, and weight computation. Should this be a service within ODIE, a separate microservice, or embedded in the knowledge substrate? Latency requirements and deployment architecture (particularly if using graph databases for the outcome graph) will drive this decision. |
| What is the communication protocol between ODIE and the knowledge substrate? | The active outcome pool must be communicated to the knowledge substrate for retrieval weighting. This could be a synchronous call (ODIE sends pool, waits for retrieval results) or asynchronous (ODIE publishes pool state, substrate subscribes and pre-indexes). The choice depends on latency requirements and whether the pool changes frequently enough to warrant streaming updates. |
| How is the decision cycle triggered? | The ODIE loop is described as continuous, but in practice it runs in response to triggers: new signals, user requests, scheduled evaluations, or threshold breaches. The trigger mechanism and its integration with the broader event architecture (PubSub, message queues) needs specification. |
| How is the system bootstrapped? | Before the system has accumulated action-outcome episodes, expected deltas must be estimated and outcome weights must be calibrated without empirical data. A bootstrapping protocol — potentially using domain expert input, transferred knowledge from similar deployments, or conservative default weights — is needed. |
| Phase | Focus | Key Deliverables | Dependencies |
|---|---|---|---|
| 1a | Core objects | Outcome CRUD, Signal ingestion, Constraint definitions | Data storage (Postgres/document store) |
| 1b | Basic scoring | MAUT implementation, Opportunity scoring, Constraint pre-filtering | Phase 1a |
| 1c | Feedback loop | Action execution → Observe → Update satisfaction cycle | Phase 1b, Fluxio integration |
| 2a | Outcome graph | Graph storage, Hierarchy + signal-based edge construction | Graph database (Neo4j or equivalent) |
| 2b | Outcome pooling | Hierarchical + signal-activated + graph proximity pooling | Phase 2a |
| 2c | Memory integration | Outcome-weighted retrieval, Memory tagging at formation | Phase 2b, Cogniscient integration |
| 3a | Belief system | Belief CRUD, Confidence scoring, Evidence linking | Phase 2c |
| 3b | Belief revision | Bayesian update mechanism, Propagation to outcomes/actions | Phase 3a |
| 3c | Adaptive depth | Decision classification, Variable evaluation depth, Pareto analysis | Phase 3b |