The First Filter

Conjecture

Nov 26, 2022

Consistently optimizing for solving alignment (or any other difficult problem) is incredibly hard.

The first and most obvious obstacle is that you need to actually care about alignment and feel responsible for solving it. You cannot just ignore it or pass the buck; you need to aim for it.

If you care, you now have to go beyond the traditions you were raised in. Be willing to go beyond the tools that you were given, and to use them in inappropriate and weird ways. This is where most people who care about alignment tend to fail — they tackle it like a normal problem from a classical field of science and not an incredibly hard and epistemologically fraught problem.

If you manage to transcend your methodological upbringing, you might come up with a different, fitter approach to attack the problem — your own weird inside view. Yet beware becoming a slave to your own insight, a prisoner to your own frame; it’s far too easy to never look back and just settle in your new tradition.

If you cross all these obstacles, then whatever you do, even if it is not enough, you will be one of the few who adapt, who update, who course-correct again and again. Whatever the critics say, you’ll actually be doing your best.

This is the first filter. This is the first hard and crucial step to solve alignment: actually optimizing for solving the problem.

When we criticize each other in good faith about our approaches to alignment, we are acknowledging that we are not wedded to any approach or tradition, and that we are both optimizing to solve the problem. This is a mutual acknowledgement that we have both passed the first filter.

Such criticism should thus be taken as a strong compliment: your interlocutor recognizes that you are actually trying to solve alignment and are open to changing your ways.
