Human decision processes are not well factored
Conjecture
Feb 17, 2023
A classic example of human bias is when our political values interfere with our ability to accept data or policies from people we perceive as opponents. When new evidence feels like a threat to their values, most people's first instinct is to deny it or subject it to extra scrutiny rather than consider it openly. Such reactions are common: when challenged, it takes active effort not to react purely defensively and to consider the critic's models, even when they are right.
How can we understand why this occurs? In a stereotypical view of human behavior, our values and preferences about the world are decoupled from our world model and beliefs. An independent decision theory uses them to guide our actions, and we use our epistemology to update our world models with new information. In this view, all these parts are nice, independent, consistent things. This view works to some degree: we know facts about all sorts of things, from politics to biology, and we seem to have values about various circumstances, and so on. Cases like political values interfering with our ability to update our beliefs are then modeled as irrational noise on top of this clean model of human behavior. If we can eliminate this noise and get ourselves to overcome our irrational urges, we can (in principle) behave rationally.
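To make this picture concrete, here is a minimal sketch of what such a cleanly factored agent could look like. The Python structure and names below are purely illustrative assumptions, not something from this post: beliefs are updated by evidence alone, values are a fixed utility function, and a decision rule reads both without modifying either.

```python
# Purely illustrative sketch of a "well factored" agent: beliefs, values,
# and the decision rule are separate, independent pieces. All names here
# are hypothetical, chosen only to mirror the example in this post.
from typing import Dict

Action = str
Outcome = str

class Beliefs:
    """World model: a probability distribution over outcomes for each action."""
    def __init__(self, model: Dict[Action, Dict[Outcome, float]]):
        self.model = model

    def update(self, action: Action, evidence: Dict[Outcome, float]) -> None:
        """Epistemology: fold new evidence (likelihoods) into the prior and
        renormalize. Crucially, this step never consults values."""
        prior = self.model[action]
        posterior = {o: prior[o] * evidence.get(o, 1.0) for o in prior}
        total = sum(posterior.values())
        self.model[action] = {o: p / total for o, p in posterior.items()}

def utility(outcome: Outcome) -> float:
    """Values: a fixed preference over outcomes, independent of beliefs."""
    return {"tool_is_useful": 1.0, "tool_is_useless": -0.5}.get(outcome, 0.0)

def decide(beliefs: Beliefs) -> Action:
    """Decision theory: pick the action with the highest expected utility.
    It reads beliefs and values but modifies neither."""
    def expected_utility(action: Action) -> float:
        return sum(p * utility(o) for o, p in beliefs.model[action].items())
    return max(beliefs.model, key=expected_utility)

# In this factored picture, critical feedback is just evidence: it shifts
# beliefs, leaves values untouched, and the decision follows mechanically.
beliefs = Beliefs({
    "build_tool": {"tool_is_useful": 0.6, "tool_is_useless": 0.4},
    "do_nothing": {"no_tool": 1.0},
})
beliefs.update("build_tool", {"tool_is_useful": 0.2, "tool_is_useless": 0.9})
print(decide(beliefs))  # the unfavorable evidence flips the choice to "do_nothing"
```

In this sketch, no amount of evidence can ever touch the utility function, and no value can block an update. The rest of this post argues that humans are not actually organized this way.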
But to make this model of factored rationality work, we need a lot of corrections to account for all of the cognitive biases that humans display. Each one adds more parameters and noise, and there are a lot of them! It should make us suspicious that, every time we come across something that contradicts this clean model, we just bolt on more noise and parameters instead of questioning the model itself.
Another potential model, one that avoids this ad hoc addition of noise, treats this unwillingness to update as something more fundamental: our values, beliefs, and decision theory are entangled and do not exist independently.
Consequences of taking this model seriously include:
Changing beliefs can change values and vice versa, making us more resistant to updating
To have sharp beliefs and values, we must actively implement them. This does not happen by default
Even after we implement a belief or value, and a decision theory around it, it is still local. The implementation may still clash with other parts of the messy processes driving our behavior
Implementing values and beliefs isn’t free and takes time and effort to do well, so we need to decide when this is worthwhile
To continue the example of accepting critical feedback, say I discuss an idea for a tool I want to build with one of my colleagues. He pushes me on a few practical details: I need to be more concrete about the use case, whether there are better ways to do it, and whether this is the best use of my time. But in its original form, my idea wasn't a clean set of claims about the world that I use to make decisions about what to build. Instead, it was tangled up with my values: I like my ideas; they are mine, after all. If I put in the work to untangle my model of reality from these emotions and values, accepting that it might feel bad, I can more directly apply the evidence and models he presents to my idea.
In this model, humans do not have cleanly separated values, world models, and decision theory! One method of dealing with this is explicitly implementing locally consistent beliefs and values and a decision theory based on them. This implementation is limited: it is, at best, locally consistent and takes time and energy to create.