The First Filter
Conjecture
Nov 26, 2022
Consistently optimizing for solving alignment (or any other difficult problem) is incredibly hard.
The first and most obvious obstacle is that you need to actually care about alignment and feel responsible for solving it. You cannot just ignore it or pass the buck; you need to aim for it.
If you care, you now have to go beyond the traditions you were raised in. Be willing to go beyond the tools that you were given, and to use them in inappropriate and weird ways. This is where most people who care about alignment tend to fail — they tackle it like a normal problem from a classical field of science and not an incredibly hard and epistemologically fraught problem.
If you manage to transcend your methodological upbringing, you might come up with a different, fitter approach to attack the problem — your own weird inside view. Yet beware becoming a slave to your own insight, a prisoner to your own frame; it’s far too easy to never look back and just settle in your new tradition.
If you cross all these obstacles, then whatever you do, even if it is not enough, you will be one of the few who adapt, who update, who course-correct again and again. Whatever the critics say, you’ll actually be doing your best.
This is the first filter. This is the first hard and crucial step to solve alignment: actually optimizing for solving the problem.
When we criticize each other in good faith about our approaches to alignment, we are acknowledging that we are not wedded to any approach or tradition, and that we are both optimizing to solve the problem. This is a mutual acknowledgement that we have both passed the first filter.
Such criticism should thus be taken as a strong compliment: your interlocutor recognizes that you are actually trying to solve alignment and open to changing your ways.