Wolpert, D. H., & Kinney, D. B. (2024). A Stochastic Model of Mathematics and Science. Foundations of Physics, 54(2), 1–67. https://doi.org/10.1007/s10701-024-00755-9


Summary

In this paper, both individual agents and the “ground truth” universe are represented as stochastic mathematical systems, and an agent’s reasoning process is represented by sequences of question-answer distributions. An agent’s performance is evaluated by calibration with the universe-SMS; that is, the universe-SMS acts as an oracle and arbiter of “correct” answers.

The mathematician-SMS and the universe-SMS (the claims accepted by a far-future community of mathematicians) have no a priori relationship. By contrast, the scientist-SMS is embedded in the universe-SMS (the physical universe generating the outcomes of experiments) by a partial function, so that the scientist’s claim sets (interpreted as brain states) are a subset of the universe’s claim sets.

Further, while evidence in mathematics involves chains of proof-like reasoning, evidence in science involves chains of brain states of the scientist that correspond to experimental outcomes or observations. This abstraction reflects the fact that scientists’ interpretations often depend on theories they have previously committed to. Then a scientist’s brain state, or set of claims, is mapped to by the collection of all physical processes in the universe that might have led to it (the preimage of the embedding function); for example, a brain state where the scientist recalls some facts would be mapped to by a set of processes that includes reading the textbook that contains the fact, the publication of that textbook, etc.

The SMS framework is applied to establish the benefit of two heuristics that are not compatible with Bayesian epistemology: stronger belief in a hypothesis with multiple lines of reasoning, and abductive inference.


Atomic notes


Key terms

  • Stochastic mathematical system (SMS) = “stochastic processes that generate pairs of questions and associated answers, with no explicit referents.”
    • Referent = the thing being signified in a “reference” relationship between symbol and object.
  • Abduction = using the explanatory power of a claim to infer its probability of being correct.

Reading notes

Motivation

  • Mathematicians and scientists acquire knowledge by mapping a set of “almost incontrovertibly true” propositions to beliefs about propositions that are not yet settled.
  • As the mathematician or scientist conducts research, the set of established facts and the set of beliefs evolve over time. Further, their reasoning about the questions they investigate and the explanations they provide is a stochastic process.
  • Real mathematicians and scientists are not limited to a single formal system, such as Bayesian probabilistic reasoning, when forming the propositions they are investigating. For this reason, both “settled facts” and “unsettled propositions” are represented as question-answer pairs without a specified formal system.

What are the differences in formalizing “established facts,” “propositions they are investigating,” and “set of beliefs about non-settled facts”?

Assumptions

  • “Mathematics is what mathematicians do.” When using the SMS model to represent mathematics itself, this implies that there is no unique answer to a mathematical question; instead, there is a “non-degenerate objective distribution over possible answers.”

What is a non-degenerate distribution (support of the same dimension)? How is it related to the concept of an isomorphism for non-degenerate functions?

  • The physical universe is a mathematical object.
  • Sequences of claims are iteratively generated by a discrete stochastic process.

Stochastic mathematical systems

Model fundamentals

  • Settled facts and unsettled propositions are represented as question-answer pairs, which are called claims.
  • The evolution of distributions over question-answer pairs is modeled using a stochastic mathematical system (SMS), which generates successive sets of question-answer pairs.

Definition: Questions, answers, claim (p. 7)

  • Let $Q$ and $A$ be arbitrary sets. An element $q \in Q$ is a question, and an element $a \in A$ is an answer.
  • A claim is an arbitrary pair $(q, a)$ in $Q \times A$. A claim vector is a finite sequence of claims. A claim set is a finite unordered set of claims.

Definition: Stochastic mathematical system, step (p. 7)

A stochastic mathematical system (SMS) is a pair consisting of a claim vector probability space and a measurable function that maps the positive integers to a sequence of random variables (i.e., measurable functions) taking values in the set of all claim vectors. Note that the SMS output is a claim set.

The step of the SMS is the integer argument of that measurable function.
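To make the inputs and outputs concrete, here is a minimal toy sketch and not the paper's construction. It assumes that a sample point can be stood in for by an integer seed (`omega`) and that the claim vector grows by one claim per step. The names `toy_sms`, `claim_set`, and `TOY_ANSWERS`, along with the questions and weights, are all hypothetical.

```python
import random
from typing import FrozenSet, Tuple

Claim = Tuple[str, str]            # (question, answer)
ClaimVector = Tuple[Claim, ...]    # finite ordered sequence of claims

# Hypothetical toy questions, each with candidate answers and sampling weights.
TOY_ANSWERS = {
    "2+2?": (("4", "5"), (0.95, 0.05)),
    "is 7 prime?": (("yes", "no"), (0.9, 0.1)),
    "is 9 prime?": (("no", "yes"), (0.8, 0.2)),
}


def toy_sms(step: int, omega: int) -> ClaimVector:
    """Map a step (positive integer) and a sample point to a claim vector.

    Assumption: the seed `omega` plays the role of a point in the
    probability space, so fixing it fixes the whole sample path.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    rng = random.Random(omega)
    questions = sorted(TOY_ANSWERS)
    claims = []
    for _ in range(step):
        question = rng.choice(questions)
        answers, weights = TOY_ANSWERS[question]
        answer = rng.choices(answers, weights=weights)[0]
        claims.append((question, answer))
    return tuple(claims)


def claim_set(vector: ClaimVector) -> FrozenSet[Claim]:
    """Forget order and repetition: the unordered set of claims in a vector."""
    return frozenset(vector)
```

In this sketch the input is the step (plus the hidden sample point), and the output is a claim vector whose unordered contents form the claim set, e.g. `claim_set(toy_sms(5, omega=0))`.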

Explicitly, what are the inputs and outputs?

Important distributions

  • Agents’ epistemic positions—that is, belief in possible answers to a given question—are represented as the distribution over possible answers conditioned on their associated questions and a particular set of established claims.
    • Notice that for established facts, the conditional probability of the answer given the question is one.

Definition: Response distribution, sure (p. 10)

For an SMS, step $t$, question $q$, and claim set $c$ (subject to the side condition given in the definition on p. 10), the associated response distribution is the map that sends any answer $a$ to the probability that $a$ is the answer to $q$ at step $t$, conditioned on $q$ being asked and on the claim set containing $c$.

In plain English, given a question $q$ and a particular SMS that produces a claim set that contains $c$ at step $t$, the response distribution is a distribution over the set of possible answers for $q$, which are produced by that SMS in response to $q$ at step $t$.

The response distribution is sure if and only if it is a delta function about some answer $a$.
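As a rough illustration of the conditioning described above, the following sketch estimates a response distribution from a finite list of sampled claim sets at one fixed step. It is an empirical stand-in and not the paper's measure-theoretic definition. The function names and sample data are hypothetical, and it assumes that only samples answering the question exactly once are counted.

```python
from collections import Counter
from fractions import Fraction
from typing import Dict, FrozenSet, Iterable, Tuple

Claim = Tuple[str, str]        # (question, answer)
ClaimSet = FrozenSet[Claim]


def response_distribution(
    samples: Iterable[ClaimSet], question: str, given: ClaimSet
) -> Dict[str, Fraction]:
    """Empirical distribution over answers to `question`, conditioned on
    the sampled claim set containing every claim in `given`."""
    counts: Counter = Counter()
    for sample in samples:
        if not given <= sample:
            continue  # conditioning: the sample must contain the given claims
        answers = [a for (q, a) in sample if q == question]
        if len(answers) != 1:
            continue  # toy assumption: the question is answered exactly once
        counts[answers[0]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("conditioning event never occurs in the samples")
    return {answer: Fraction(n, total) for answer, n in counts.items()}


def is_sure(distribution: Dict[str, Fraction]) -> bool:
    """True when all probability mass sits on a single answer."""
    return any(p == 1 for p in distribution.values())


# Hypothetical samples of the claim set at one fixed step.
SAMPLES = [
    frozenset({("2+2?", "4"), ("is 7 prime?", "yes")}),
    frozenset({("2+2?", "4"), ("is 7 prime?", "yes")}),
    frozenset({("2+2?", "4"), ("is 7 prime?", "no")}),
    frozenset({("2+2?", "5"), ("is 7 prime?", "no")}),
]
ESTABLISHED = frozenset({("2+2?", "4")})

unsettled = response_distribution(SAMPLES, "is 7 prime?", ESTABLISHED)
settled = response_distribution(SAMPLES, "2+2?", ESTABLISHED)
```

Here `settled` is sure because the question is already answered by a claim in the conditioning set, while `unsettled` spreads its mass over two answers.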

Calibration

  • The full definition of calibration is different from simply requiring minimal divergence because calibration requires that the answers of the individual are, on average, very close to the answers from a ground truth “oracle.”

Mathematics

  • The ground truth oracle is interpreted as the answers given by a far-future community of mathematicians, and the universe-SMS is the process by which all mathematicians over all time ask and answer questions.
  • A mathematician-SMS answers questions correctly to the extent that the answers are sampled from a response distribution that is minimally divergent from the limit response distribution of the universe-SMS, where both response distributions are conditioned on the same claim set.
    • Asking “what is the response to this question?” is operationally the same as asking “what is the answer that an oracle would give to this question?”.
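The notes above do not say which divergence is meant, so the sketch below assumes the Kullback-Leibler divergence from the oracle's distribution to the mathematician's, purely for illustration. It covers only the "minimally divergent" comparison and not the full definition of calibration. All distributions and names are hypothetical.

```python
import math
from typing import Dict


def kl_divergence(reference: Dict[str, float], candidate: Dict[str, float]) -> float:
    """KL(reference || candidate) over a finite answer set, in nats.

    Returns infinity when the candidate assigns zero probability to an
    answer that the reference considers possible.
    """
    total = 0.0
    for answer, p in reference.items():
        if p == 0.0:
            continue  # terms with zero reference mass contribute nothing
        q = candidate.get(answer, 0.0)
        if q == 0.0:
            return math.inf
        total += p * math.log(p / q)
    return total


# Hypothetical response distributions for one question, conditioned on
# the same claim set.
oracle = {"yes": 0.9, "no": 0.1}              # stand-in for the limit distribution
mathematician_a = {"yes": 0.8, "no": 0.2}
mathematician_b = {"yes": 0.5, "no": 0.5}

divergence_a = kl_divergence(oracle, mathematician_a)
divergence_b = kl_divergence(oracle, mathematician_b)
```

Under this assumed measure, `mathematician_a` answers "more correctly" than `mathematician_b` because its response distribution diverges less from the oracle's.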

Science

  • The oracle is interpreted as the experimental outcomes generated by the physical universe. A simple deterministic universe is modeled by a universe-SMS such that the response distribution over all answers to any question and any prior claim set is sure.
  • A scientist-SMS generates cognitive events about some physical observations (i.e., question-answer pairs); in fact, we can interpret the response distribution as patterns in the physical brain.
  • In contrast to the independent models in the mathematics case, the scientist-SMS is embedded in a universe-SMS. This means that the response distribution for the scientist-SMS is given by a distribution over universe states that “project down” to the scientist’s brain patterns (see p. 21 for examples).
    • The claim set is a “physical property of the scientist’s own brain, corresponding to their mental perception of the outcomes of chains of experiments or observations they have made.”
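The "project down" idea can be sketched with a finite toy example. The code assumes the embedding is a partial function from universe states to brain states, represented as a dict that omits states where it is undefined, and it assumes the projected distribution is renormalized over the states where the function is defined. The state names and probabilities are hypothetical and not taken from the paper.

```python
from collections import defaultdict
from fractions import Fraction
from typing import Dict, Set


def project(universe: Dict[str, Fraction], embed: Dict[str, str]) -> Dict[str, Fraction]:
    """Push a distribution over universe states through a partial function.

    Universe states missing from `embed` are outside its domain and are
    dropped; the remaining mass is renormalized (an assumption of this sketch).
    """
    mass: Dict[str, Fraction] = defaultdict(Fraction)
    for state, probability in universe.items():
        brain_state = embed.get(state)
        if brain_state is not None:
            mass[brain_state] += probability
    total = sum(mass.values())
    if total == 0:
        raise ValueError("no probability mass lies in the domain of the embedding")
    return {brain_state: p / total for brain_state, p in mass.items()}


def preimage(embed: Dict[str, str], brain_state: str) -> Set[str]:
    """All universe states that the partial function sends to `brain_state`."""
    return {state for state, image in embed.items() if image == brain_state}


# Hypothetical universe states and the brain states they lead to.
EMBED = {
    "read the textbook": "recalls the fact",
    "heard it in a lecture": "recalls the fact",
    "ran the experiment": "observes the outcome",
    # "scientist asleep" is deliberately absent: the function is undefined there.
}
UNIVERSE = {
    "read the textbook": Fraction(1, 4),
    "heard it in a lecture": Fraction(1, 4),
    "ran the experiment": Fraction(1, 4),
    "scientist asleep": Fraction(1, 4),
}

brain_distribution = project(UNIVERSE, EMBED)
causes_of_recall = preimage(EMBED, "recalls the fact")
```

`causes_of_recall` collects every physical process in the toy universe that could have produced the same brain state, which is the role the preimage plays in the summary above.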