Your Brain Doesn’t Have “Present Bias”, It Has Optimal Commitment Timing
TL;DR
Present bias might be optimal computation instead of irrationality. The brain uses similar machinery for “$80 now vs. $100 later” as for motor planning. Motor control solved this decades ago: maintain options until maintenance cost exceeds information value. Four testable predictions: (1) preference reversals track uncertainty resolution timing, not just time passing; (2) option value follows log(N) from information theory; (3) discount rates scale with cognitive complexity; (4) forcing longer planning horizons kills most “present bias.” If even one prediction fails cleanly, the theory is wrong. If they all hold, behavioral economics has been misinterpreting data for 40 years.
The thing that never made sense
People choose $80 today over $100 in a month, then turn around and choose $100 in 13 months over $80 in 12 months. This preference reversal has been documented thousands of times. The standard explanation: “present bias” or competing neural systems (hot vs. cold, System 1 vs. System 2), as if the brain contains distinct homunculi fighting for control.
There’s a simpler explanation hiding in plain sight, in a completely different literature: motor control theory. Once you see the connection, what behavioral economics calls “irrational” behavior looks like optimal decision-making under computational constraints.
Motor control nerds already cracked this
Over the past 30 years, computational neuroscience reverse-engineered how the brain plans and executes movements. The core insights:
1. The brain maintains multiple motor plans simultaneously
When reaching for a cup, your motor cortex doesn’t commit to a single trajectory immediately. It maintains probability distributions over several possible paths, updating these distributions as sensory information arrives. Commitment is delayed until the marginal value of information falls below the marginal cost of maintaining options.
Why? Because early commitment under uncertainty is expensive. If you commit to a plan and then receive information that makes it suboptimal, correction costs are high (energy and time). Better to hedge your bets until the last responsible moment.
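The commit-or-wait rule above can be put in toy form. This is an illustrative sketch, not a fitted motor-control model: the linear value-of-information assumption and every number below are invented.

```python
def commit_now(uncertainty_bits, maintenance_cost_per_step,
               info_bits_per_step, value_per_bit):
    """Toy late-commitment rule: keep options open while the value of the
    next step's information exceeds the cost of maintaining uncommitted
    plans. All quantities are illustrative assumptions."""
    marginal_info_value = value_per_bit * min(info_bits_per_step, uncertainty_bits)
    return marginal_info_value < maintenance_cost_per_step

# Early on, uncertainty is high: waiting pays, so keep hedging.
print(commit_now(uncertainty_bits=4.0, maintenance_cost_per_step=0.5,
                 info_bits_per_step=1.0, value_per_bit=2.0))  # False: keep hedging
# Once uncertainty is nearly resolved, waiting no longer pays: commit.
print(commit_now(uncertainty_bits=0.1, maintenance_cost_per_step=0.5,
                 info_bits_per_step=1.0, value_per_bit=2.0))  # True: commit
```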
2. Effort costs scale with force squared and duration
The metabolic cost of muscle activation follows a predictable function: approximately proportional to duration × force². This falls directly out of ATP consumption and muscle fiber recruitment biophysics.
3. The brain only corrects deviations that matter
The “minimum intervention principle” says your motor system tolerates variability in dimensions that don’t affect task success. Reaching for a large cup, slight deviations in hand orientation don’t trigger corrections. Threading a needle, they do. The system is tuned to what matters.
4. Uncertainty is explicitly represented and prospectively calculated
Your motor system doesn’t just react to sensory feedback. Instead, it maintains uncertainty estimates and uses them to plan. High uncertainty = more resources devoted to maintaining multiple plans. Low uncertainty = commit earlier.^[These principles are well-established in motor control: optimal feedback control and minimum intervention (Todorov & Jordan 2002; Scott 2004), affordance competition and late commitment (Cisek 2007; collapsing decision bounds in Shadlen et al.), and vigor/average reward rate (Niv et al. 2007). The math is worked out—just applying it to economic choice.]
The Hypothesis: Same Objective, Shared Control Principles
Core claim: The computational machinery that solves intertemporal choice uses the same optimization principles and overlapping control circuits as motor planning. Whether it’s literally the same neural circuits is an empirical question, but at minimum, they’re solving the same computational problem under the same physical constraints.
This predicts “irrational” economic behavior should follow the exact same patterns as “rational” motor behavior. And it does.
Prediction 1: Preference Reversals Are About Uncertainty, Not Time
The standard story: You reverse preferences as the immediate reward gets temporally closer because your “hot” system takes over.
The motor control story: You reverse preferences when uncertainty about the delayed reward remains high relative to uncertainty about the immediate reward. Time is just a proxy for information arrival.
Critical test: Manipulate information structure independently of time.
- Condition A (Standard): Choose between $80 now vs. $100 in 30 days. Subject learns nothing else until day 30.
- Condition B (Early Certainty): Choose between $80 now vs. $100 in 30 days. But subjects receive legally binding confirmation at decision time that the $100 is guaranteed (signed contract, escrowed funds).
- Condition C (Late Uncertainty): Choose between $80 now vs. $100 in 30 days. But subjects learn on day 25 that there’s a 20% chance the delayed payment will fall through (budget uncertainties, whatever).
Important controls: All conditions equate expected value and risk at decision time. Condition B uses escrow to remove trust variance while maintaining identical payment timing. No subjects are cash-constrained (screened and compensated separately for liquidity needs).
Standard present-bias models predict the same reversal rate in all three conditions (time is what matters).
Motor control predicts:
- Condition B shows dramatically LESS reversal (early uncertainty reduction)
- Condition C shows MORE reversal (late uncertainty increase)
- The effect should be quantitatively predictable from the magnitude of uncertainty reduction
If the theory is right, this effect should be large and robust.
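One way to see the predicted ordering is a toy cost model in which the delayed option carries a per-day cost proportional to the uncertainty the chooser must hold on that day. Every number here is invented; the sketch only illustrates the direction of the three predictions.

```python
# Toy model (all numbers invented): the delayed option pays an ongoing
# "uncertainty maintenance" cost until its payout is certain.
def optionality_cost(daily_uncertainty, cost_per_unit=0.7):
    return cost_per_unit * sum(daily_uncertainty)

immediate, ev_delayed = 80.0, 100.0
u_a = [1.0] * 30              # A: moderate uncertainty until day 30
u_b = [0.0] * 30              # B: escrow resolves uncertainty at day 0
u_c = [1.0] * 25 + [3.0] * 5  # C: uncertainty spikes after day 25's news

values = {}
for label, u in [("A", u_a), ("B", u_b), ("C", u_c)]:
    values[label] = ev_delayed - optionality_cost(u)
    verdict = "reversal to $80" if values[label] < immediate else "stays with $100"
    print(label, values[label], "->", verdict)
```

With these made-up costs, B shows no reversal, A a marginal one, and C the strongest one, matching the predicted ordering.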
Prediction 2: Option Value Has Specific Computational Form
Why do people pay to “keep their options open” even when there’s no instrumental value? Motor control gives the answer: maintaining multiple active plans has costs (neural resources, attention, working memory) but also value (flexibility under uncertainty).
The optimal number of simultaneously maintained plans follows from information theory. Specifically, the value of maintaining N options should follow:
Value(N options) ≈ log(N) × Uncertainty - k × N
Where k is the cognitive cost per option. The log(N) term comes from information bottleneck theory: the value grows with usable bits about alternatives. For roughly independent options with similar priors, bits scale like log N. This assumes options aren’t perfectly correlated and you have reasonably uniform uncertainty across them.
This predicts:
- Diminishing returns: 2 options >> 1 option, but 10 options ≈ 9 options (logarithmic)
- Sensitivity to cognitive load: Under load, k increases, so optimal N decreases
- Uncertainty-dependence: When future states are predictable, option value collapses
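A minimal sketch of that functional form, where the uncertainty scale u and the per-option cost k are free parameters chosen for illustration:

```python
import math

# Proposed option-value function; u and k are illustrative free parameters.
def option_value(n, uncertainty, k):
    return math.log(n) * uncertainty - k * n

u, k = 10.0, 0.5
# Diminishing returns: the jump from 1 -> 2 options dwarfs the jump from 9 -> 10.
gain_1_to_2 = option_value(2, u, k) - option_value(1, u, k)
gain_9_to_10 = option_value(10, u, k) - option_value(9, u, k)
print(round(gain_1_to_2, 2), round(gain_9_to_10, 2))  # 6.43 0.55

# Load sensitivity: doubling the per-option cost k shrinks the optimal N.
def best_n(uncertainty, k, n_max=30):
    return max(range(1, n_max + 1), key=lambda n: option_value(n, uncertainty, k))

print(best_n(10.0, 0.5), best_n(10.0, 1.0))  # 20 10
```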
Experimental test:
Give subjects choices between:
- Option A: Commit now to X
- Option B: Pay $2 to decide later between {X, Y}
- Option C: Pay $4 to decide later between {X, Y, Z}
- Option D: Pay $6 to decide later between {X, Y, Z, W}
Manipulate uncertainty (predictability of which option will be best) and cognitive load (secondary task).
Motor control predicts a precise relationship between uncertainty level, cognitive load, and willingness to pay for additional options. Current models don’t give you the functional form (they just say “people like flexibility”).
Prediction 3: Discount Functions Reflect Information Processing Costs
If temporal discounting emerges from the same optimization principles as motor control, the functional form shouldn’t be arbitrary. Motor control research shows effort costs follow Duration × Force² because of control penalties in motor commands.
For economic choice, the cleaner story is about information processing: discount rates should scale with the bits or policy complexity required to represent and maintain delayed rewards. In motor control, quadratic penalties arise from control costs; in cognition, complexity costs may scale similarly, with policy complexity acting as a surrogate for control effort. This predicts:
Discount strength should increase with representational and policy complexity (i.e., bits required), plausibly super-linear in tasks that impose quadratic control-like penalties.
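As a sketch, the prediction can be written as a complexity-dependent discount rate. The functional form and every parameter below are assumptions for illustration, not fits to data.

```python
import math

# Hypothetical discount model: the effective rate grows with the bits needed
# to represent the delayed reward, plus a quadratic "control penalty" term.
# All parameter values are invented.
def discounted_value(amount, delay_days, complexity_bits,
                     base_rate=0.001, per_bit=0.0005, quadratic=0.0002):
    rate = base_rate + per_bit * complexity_bits + quadratic * complexity_bits ** 2
    return amount * math.exp(-rate * delay_days)

# Easy condition ("$100 in 30 days") vs. hard (formula tied to stock performance):
easy = discounted_value(100, 30, complexity_bits=2)
hard = discounted_value(100, 30, complexity_bits=10)
print(round(easy, 2), round(hard, 2))  # 91.94 45.84
```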
Test: Manipulate representation complexity for delayed rewards:
- Easy condition: “$100 in 30 days” (simple representation)
- Medium condition: “$100 in 30 days, but converted to Euros at the exchange rate on that day” (more complex)
- Hard condition: “$100 in 30 days, paid in installments based on a formula tied to stock market performance” (very complex)
If complexity costs follow control-penalty logic, expect near-quadratic scaling. But even monotonic scaling with bits would support the framework. Existing models predict either no effect or weak linear effects.
Moreover: Physical fatigue should affect temporal discounting in predictable ways. Make people do exhausting physical work (depleting ATP), and watch their discount rates change. The magnitude should match predictions from metabolic cost functions in motor control.
Prediction 4: “Present Bias” Is Optimal Late Commitment
Radical claim: Present bias isn’t a bias. It’s optimal commitment timing given computational constraints.
When choosing between $80 now and $100 in 30 days, you’re not choosing between two outcomes. You’re choosing between:
- Committing now to the immediate option (low uncertainty, low optionality cost)
- Maintaining uncertainty about the delayed option for 30 days (higher optionality cost, higher cognitive load)
As you get closer to day 30, the optionality cost decreases (don’t have to maintain the plan for as long), so the delayed option becomes more attractive. This looks like “preference reversal” but it’s actually rational updating about commitment costs.
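That logic can be sketched with a single per-day maintenance cost. The cost value is invented; only the direction of the effect matters.

```python
# Toy model (cost_per_day is invented): choosing the $100 option means
# maintaining its plan until day 30, at a small per-day cost.
def net_delayed_value(amount, days_remaining, cost_per_day=0.8):
    return amount - cost_per_day * days_remaining

immediate = 80.0
choices = []
for day in (0, 10, 20, 29):
    net = net_delayed_value(100.0, 30 - day)
    choices.append("delayed" if net > immediate else "immediate")
    print(f"day {day}: net delayed value {net:.1f} -> choose {choices[-1]}")
```

Early on the maintenance cost outweighs the extra $20, so the immediate option wins; as day 30 nears, the delayed option takes over, reproducing the apparent reversal without any hyperbolic machinery.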
Experimental test:
Force early commitment in a way that removes the ongoing optionality cost:
- Standard condition: Choose between $80 now vs. $100 in 30 days. Decision is binding when made.
- Planning condition: On day 1, choose between two complete 30-day plans:
- Plan A: “$80 on day 1, $90 on day 15, $100 on day 30” (all delayed)
- Plan B: “$70 on day 1, $80 on day 15, $90 on day 30” (all delayed, uniformly lower)
In the planning condition, you’re not choosing between “now” and “later”; you’re choosing between two complete temporal trajectories. Motor control predicts present bias should shrink substantially because the optimization problem has changed: you’re removing ongoing option-maintenance costs by making a single trajectory choice at the outset.
Current models predict people should still show present bias for the early rewards in the sequence. Motor control predicts they won’t (because they’re now optimizing over the full trajectory with a single commitment).
What falls out of this
If the theory holds:
1. Why “present bias” is so context-dependent
It’s not a fixed preference parameter; it’s a computational tradeoff that depends on uncertainty, cognitive load, representation complexity, and planning horizon. Change any of these, and the “bias” changes.
2. Why animals show similar patterns
You don’t need to invoke culturally-specific learning or sophisticated reasoning. This is fundamental computational architecture that emerges from physical constraints on information processing.
3. Why interventions often fail
Present bias has been treated as something to overcome (commitment devices, willpower, etc.). But if it’s optimal computation, you can’t fight it directly. Change the information structure or the planning context instead.
4. The hard problem of cognitive costs
Behavioral economics typically assumes cognitive costs but doesn’t derive them. Motor control gives actual equations based on biophysics and information theory.
Things that could screw this up
Some ways the experiments could fail even if the theory is right:
Information timing vs. risk/ambiguity: The uncertainty manipulation changes when information arrives, but this also risks changing risk or ambiguity aversion. Expected values must be equated at decision time.
Trust and liquidity: Early certainty manipulations (like escrow) remove trust concerns and potentially liquidity constraints. These must be controlled independently to isolate the pure information-timing effect.
Time preference vs. opportunity cost: Delayed rewards might be discounted partly because of genuine time preference (impatience) or opportunity costs, not just computational costs of maintaining representations. The theory predicts effects above and beyond these factors.
Dual-task confounds: When manipulating cognitive load, the secondary task can’t change motivation or attention in ways that confound the interpretation.
Ecological validity: In the real world, distant rewards often ARE more uncertain (people die, organizations fail, circumstances change). So time and uncertainty are naturally confounded. The experiments artificially decouple them to test the mechanism, but the theory still needs to explain ordinary present bias as a rational response to this natural correlation.
How you’d actually test this
Phase 1: Uncertainty Timing (Most Direct Test)
Run the early certainty vs. late uncertainty experiment. This is the cleanest test because it directly manipulates the key variable (information structure) while holding everything else constant.
Prediction: Large effect (>30% reduction in preference reversals with early certainty). No effect = theory dead.
Phase 2: Forced Planning Horizons
Test whether requiring people to plan over longer horizons changes apparent present bias. This tests whether “bias” is actually about commitment timing.
Prediction: Present bias shrinks substantially (maybe 50%+) when decisions are bundled into longer planning contexts.
Phase 3: Cognitive Load × Option Value
Measure the precise functional form of option value under varying uncertainty and cognitive load. This tests whether the information-theoretic predictions hold.
Prediction: Log relationship between number of options and value, modulated by uncertainty and cognitive load.
Phase 4: Physical Fatigue × Discount Rates
Test whether metabolic state affects temporal discounting in ways predicted by motor control cost functions.
Prediction: Exhausting exercise increases discount rates, magnitude predictable from ATP depletion rates.
The Bet
I’m confident enough in this framework to make specific directional predictions:
- Uncertainty timing effect: Early certainty should reduce preference reversals by 25-40% relative to standard conditions. The effect should be large and robust.
- Planning horizon effect: Forced long-horizon planning should reduce apparent present bias by 35-50% compared to standard conditions. This should replicate across different reward magnitudes.
- Functional form of option value: Option value should be better fit by logarithmic functions than linear or step functions. BIC advantage for log models should exceed 10 points.
- Physical fatigue effect: Exhausting exercise should increase discount rates by 15-25%, with effect magnitude predictable from motor control equations (correlation between predicted and observed changes r > 0.5).
These aren’t vague qualitative predictions. They specify directions, approximate effect sizes, and functional forms. If even one of these fails to show the predicted pattern, the theory needs serious revision.^[The confident tone reflects my view that these effects are large if the theory is correct. But I’m not claiming they’re guaranteed—that’s what experiments are for. These are preregistered predictions, not post-hoc fits.]
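The model-comparison logic behind the third bet can be sketched on synthetic data. The generating model, noise level, and sample sizes below are all invented for illustration.

```python
import math
import random

# Synthetic illustration: generate option values from the log model, then
# compare BIC for log vs. linear fits. All data and parameters are invented.
random.seed(0)
ns = list(range(1, 11)) * 5  # 1..10 options, 5 replicates each
ys = [3.0 * math.log(n) + random.gauss(0, 0.3) for n in ns]

def ols_rss(xs, ys):
    # One-predictor least squares (intercept + slope); returns residual SS.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def bic(rss, n, k):
    # Gaussian-likelihood BIC up to an additive constant.
    return n * math.log(rss / n) + k * math.log(n)

n = len(ys)
bic_log = bic(ols_rss([math.log(x) for x in ns], ys), n, 2)
bic_lin = bic(ols_rss(ns, ys), n, 2)
print(bic_lin - bic_log)  # positive: the log model wins by well over 10 points
```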
Implications
For policy: If I’m right, nudge units can stop printing glossy brochures about long-term thinking and just restructure information timing.
For theory: Dual-process models might be solving a non-existent problem. Single optimizer, computational constraints.
For neuroscience: Testable prediction: SMA, PMd, and basal ganglia should light up during intertemporal choice with the same computational signatures as motor planning. If they’re just “also active” without matching patterns, theory is dead.
For AI: Stop bolting on arbitrary discount functions. Derive them from architectural constraints and watch the biases emerge for free.
Behavioral economics catalogued dozens of “biases.” Maybe they’re just optimal solutions to computational problems under physical constraints. Motor control already has the math. The circuits probably overlap. The “irrationality” is rationality under constraints.
Or maybe I’m massively wrong and it’s all just hyperbolic discounting. Either way, somebody should run the experiments.
Predictions are logged. If you’re equipped to test any of this and want to collaborate, reach out. If none of it replicates, at least there’ll be a clean falsification.