Overview

@2022hoPeople introduce value-guided task construals to model the process of adaptively selecting simplified representations of cause-effect relationships during task planning. Intuitively, a construal “picks out” details in a task to consider.

The problem of selecting a task construal is formulated as an approximately optimal trade-off between cognitive cost and task performance, or behavioral utility. This gives a normative, resource-rational account of planning (@2023hoRational).

The key idea is to treat model and policy selection as a two-level optimization process. An outer loop selects a construal (a simplified model of cause-effect relationships) by optimizing the value of representation over task construals. An inner-loop planning algorithm then uses the selected construal to compute a policy that is optimal under that construal.


Preliminaries: MDP models of sequential decision-making

MDP model of sequential decision-making tasks

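A task representation $M$ consists of the following data: a state space $S$ with initial state $s_0 \in S$; an action space $A$; a transition function $T(s' \mid s, a)$; and a utility function $U(s, a)$. The value of a plan $\pi$ is defined for all states $s \in S$ by the expected cumulative utility of using that plan:

$$V_\pi(s) = \sum_{a} \pi(a \mid s) \Big[ U(s, a) + \sum_{s'} T(s' \mid s, a)\, V_\pi(s') \Big]$$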
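As a rough illustration of how the value of a plan could be computed, the sketch below runs iterative policy evaluation on a tiny chain task. The function name `evaluate_plan`, the dictionary layout, the absorbing terminal state, and the three-state example are all assumptions made for this sketch, not part of the original model.

```python
def evaluate_plan(states, T, U, plan, terminal, tol=1e-10, max_iter=10_000):
    """Iterative policy evaluation for an undiscounted, episodic task.

    T[(s, a)]  -> dict {next_state: probability}
    U[(s, a)]  -> float utility of taking action a in state s
    plan[s]    -> dict {action: probability}
    terminal   -> set of absorbing states whose value is fixed at 0
    """
    V = {s: 0.0 for s in states}
    for _ in range(max_iter):
        delta = 0.0
        for s in states:
            if s in terminal:
                continue
            v = 0.0
            for a, p_a in plan[s].items():
                expected_next = sum(p * V[s2] for s2, p in T[(s, a)].items())
                v += p_a * (U[(s, a)] + expected_next)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:  # stop once a full sweep changes nothing noticeable
            break
    return V


# Hypothetical chain 0 -> 1 -> 2 (terminal); "go" succeeds with probability 0.8.
states = [0, 1, 2]
terminal = {2}
T = {(0, "go"): {1: 0.8, 0: 0.2}, (1, "go"): {2: 0.8, 1: 0.2}}
U = {(0, "go"): -1.0, (1, "go"): -1.0}  # each step costs one unit of utility
plan = {0: {"go": 1.0}, 1: {"go": 1.0}}

V = evaluate_plan(states, T, U, plan, terminal)
```

In this example each transition takes 1/0.8 = 1.25 steps in expectation, so the values settle near -2.5 for state 0 and -1.25 for state 1.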


Value-guided task construals

Construal

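Suppose an agent has $N$ primitive cause-effect relationships $\phi_1, \ldots, \phi_N$ assigning probabilities to state, action, and next-state transitions:

$$T(s' \mid s, a) \propto \prod_{i=1}^{N} \phi_i(s' \mid s, a)$$

where each $\phi_i$ is a potential function representing the local effect of taking some action. A construal $c \subseteq \{\phi_1, \ldots, \phi_N\}$ is a subset of primitive cause-effect relationships that produces a task construal $M_c$ that shares the same state space, action space, and utility function with $M$, but has a construed transition function

$$T_c(s' \mid s, a) \propto \prod_{\phi_i \in c} \phi_i(s' \mid s, a)$$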
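To make the product-of-potentials construction concrete, here is a minimal sketch that builds a construed transition distribution by multiplying only the potentials included in a construal and then normalizing. The helper `construed_transition`, the potentials `move` and `wall`, and the four-cell corridor are hypothetical choices for illustration only.

```python
def construed_transition(potentials, construal, s, a, next_states):
    """Return {next_state: prob}, proportional to the product of the
    potentials named in `construal`, each evaluated at (next_state, s, a)."""
    weights = {}
    for s2 in next_states:
        w = 1.0
        for name in construal:
            w *= potentials[name](s2, s, a)
        weights[s2] = w
    total = sum(weights.values())
    if total == 0:
        raise ValueError("construal assigns zero weight to every next state")
    return {s2: w / total for s2, w in weights.items()}


# Hypothetical corridor with cells 0..3 and a wall occupying cell 2.
def move(s2, s, a):
    # Moving right usually advances one cell; a small weight allows staying put.
    if s2 == s + 1:
        return 1.0
    if s2 == s:
        return 0.1
    return 0.0


def wall(s2, s, a):
    # The wall makes cell 2 unreachable.
    return 0.0 if s2 == 2 else 1.0


potentials = {"move": move, "wall": wall}
cells = range(4)

full = construed_transition(potentials, {"move", "wall"}, 1, "right", cells)
simple = construed_transition(potentials, {"move"}, 1, "right", cells)
```

Under the full construal the agent expects to stay in cell 1, whereas the construal that leaves out `wall` predicts that it will most likely reach cell 2.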

Behavioral utility, value of representation

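Given a decision-maker with task construal $M_c$, let $\pi_c$ be the optimal plan computed under $M_c$. The behavioral utility of $\pi_c$ when starting at state $s_0$ is defined as its performance when interacting with the actual transition dynamics $T$:

$$U(\pi_c) = V_{\pi_c}(s_0)$$

where $V_{\pi_c}$ is evaluated under $T$ rather than $T_c$. The value of representation for the construal $c$ is

$$\mathrm{VOR}(c) = U(\pi_c) - C(c)$$

where $C(c) = |c|$ is the cognitive cost, defined as the cardinality of $c$.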
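The trade-off between behavioral utility and cognitive cost can be sketched by enumerating every construal and subtracting its size from its behavioral utility. In this sketch the behavioral utilities are made-up numbers supplied as a lookup table; in a full implementation they would come from planning under each construal and evaluating the resulting plan under the actual dynamics. The names `powerset`, `value_of_representation`, `wall_A`, and `wall_B` are assumptions.

```python
from itertools import chain, combinations


def powerset(items):
    """All subsets of `items`, each as a frozenset."""
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(subset) for subset in subsets]


def value_of_representation(construal, behavioral_utility):
    """Behavioral utility of the construal's plan minus its cardinality."""
    return behavioral_utility[construal] - len(construal)


# Hypothetical task: wall_A blocks the direct route, wall_B is out of the way.
relationships = ["wall_A", "wall_B"]
behavioral_utility = {
    frozenset(): -20.0,                      # plan collides with wall_A
    frozenset({"wall_A"}): -10.0,            # plan routes around wall_A
    frozenset({"wall_B"}): -20.0,            # attends to the irrelevant wall only
    frozenset({"wall_A", "wall_B"}): -10.0,  # full detail, no extra benefit
}

vor = {
    c: value_of_representation(c, behavioral_utility)
    for c in powerset(relationships)
}
best = max(vor, key=vor.get)
```

With these illustrative numbers the best construal contains only `wall_A`: adding `wall_B` costs one unit of cognitive cost without improving the plan.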


Model implementation

  • Given a value of representation function $\mathrm{VOR}$ that assigns a value to each construal, decision-makers are modeled as selecting a construal according to a softmax decision rule, $P(c) \propto \exp\{\mathrm{VOR}(c) / \alpha\}$, where $\alpha > 0$ is a temperature parameter.
  • The process of revisiting and modifying construals at each stage of planning is represented as a sequential decision-making problem, the construal modification Markov decision process.
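The softmax decision rule in the first item above can be sketched as follows. The function name `softmax_construal_probs`, the construal labels, and the example values are assumptions; subtracting the maximum before exponentiating is a standard numerical-stability step and does not change the resulting probabilities.

```python
import math


def softmax_construal_probs(vor, alpha=1.0):
    """Map {construal: value of representation} to selection probabilities.

    alpha is the temperature: small values concentrate probability on the
    best construal, large values approach a uniform choice.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    highest = max(vor.values())
    weights = {c: math.exp((v - highest) / alpha) for c, v in vor.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Hypothetical values of representation for two construals.
vor = {"sparse": -11.0, "detailed": -12.0}
probs = softmax_construal_probs(vor, alpha=1.0)
cold = softmax_construal_probs(vor, alpha=0.1)
```

A one-unit gap in value at temperature 1 gives the better construal a probability of roughly 0.73; lowering the temperature to 0.1 makes the choice nearly deterministic.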

Construal modification Markov decision process

Given a set of cause-effect relationships, let $\mathcal{C}$ be the set of all possible construals (i.e., the powerset of cause-effect relationships). The construal modification Markov decision process has state space $\mathcal{C} \times S$ and an action space corresponding to possible next construals $c' \in \mathcal{C}$. After selecting a new construal $c'$, the probability of transitioning from task state $s$ to $s'$ is given by

$$P(s' \mid s, c') = \sum_{a} \pi_{c'}(a \mid s)\, T(s' \mid s, a)$$

that is, first calculating a joint distribution over actions and next states using the actual transition function $T$ and plan $\pi_{c'}$, then marginalizing over task actions $a$.
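The marginalization step can be sketched in a few lines: weight each action's next-state distribution by the probability that the plan chooses that action, then sum. The function name `task_state_transition` and the two-action example are hypothetical.

```python
def task_state_transition(plan, T, s):
    """Next-state distribution from state s after marginalizing over actions.

    plan[s]    -> dict {action: probability} for the plan of the chosen construal
    T[(s, a)]  -> dict {next_state: probability} under the actual dynamics
    """
    marginal = {}
    for a, p_a in plan[s].items():
        for s2, p in T[(s, a)].items():
            # Joint probability of (a, s2), accumulated over actions.
            marginal[s2] = marginal.get(s2, 0.0) + p_a * p
    return marginal


# Hypothetical example: the plan mostly moves right, which succeeds 80% of the time.
plan = {0: {"left": 0.25, "right": 0.75}}
T = {(0, "left"): {0: 1.0}, (0, "right"): {1: 0.8, 0: 0.2}}

marginal = task_state_transition(plan, T, 0)
```

Here the agent stays in state 0 with probability 0.25 + 0.75 × 0.2 = 0.4 and reaches state 1 with probability 0.75 × 0.8 = 0.6.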

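The optimal construal modification value function is defined for all $c \in \mathcal{C}$ and $s \in S$ by

$$V(c, s) = \max_{c' \in \mathcal{C}} \Big[ U(s, \pi_{c'}) + \sum_{s'} P(s' \mid s, c')\, V(c', s') - |c' \setminus c| \Big]$$

where $U(s, \pi_{c'}) = \sum_a \pi_{c'}(a \mid s)\, U(s, a)$ is the expected utility of acting under $\pi_{c'}$ in $s$, and $|c' \setminus c|$ is the number of additional cause-effect relationships in $c'$ compared to $c$.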
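A single backup of this value function might look like the sketch below. It assumes that the cost term counts only relationships added to the construal, so dropping relationships is free, and that the expected utilities, transition distributions, and successor values are supplied as lookup tables. All names and numbers are hypothetical.

```python
def modification_cost(current, proposed):
    """Number of cause-effect relationships in `proposed` that are not in `current`."""
    return len(frozenset(proposed) - frozenset(current))


def backup(c, s, candidates, expected_utility, transition, V):
    """One backup at (c, s): best next construal after paying for additions.

    expected_utility[(s, c2)] -> expected immediate utility under c2's plan
    transition[(s, c2)]       -> dict {next_state: probability}
    V[(c2, next_state)]       -> current value estimate (0.0 if missing)
    """
    return max(
        expected_utility[(s, c2)]
        + sum(p * V.get((c2, s2), 0.0) for s2, p in transition[(s, c2)].items())
        - modification_cost(c, c2)
        for c2 in candidates
    )


# Hypothetical example: attending to the wall costs 1 but avoids a crash worth -10.
empty = frozenset()
with_wall = frozenset({"wall"})
expected_utility = {(0, empty): -1.0, (0, with_wall): -1.0}
transition = {(0, empty): {"crash": 1.0}, (0, with_wall): {"goal": 1.0}}
V = {(empty, "crash"): -10.0, (with_wall, "goal"): 0.0}

value = backup(empty, 0, [empty, with_wall], expected_utility, transition, V)
```

Keeping the empty construal yields -1 - 10 - 0 = -11, while adding the wall yields -1 + 0 - 1 = -2, so the backup returns -2.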