The Fund for Alignment Research is hiring research engineers to help with AI safety projects in reinforcement learning, natural language processing, and adversarial robustness.
We are Adam Gleave and Scott Emmons, two artificial intelligence (AI) safety researchers working with the Center for Human-Compatible AI at UC Berkeley. We aim to reduce existential risk from transformative AI systems, and we are looking for research engineers to collaborate with us on our current projects.
Broadly speaking, our research is focused on how AI can learn about humans’ goals and then robustly help humans achieve them. Our expertise is in reinforcement learning (RL), one of the most general frameworks for building transformative AI. In our prior work, we have developed ways to measure the goals of RL agents and teach a single agent to achieve many different goals. We have also identified a new threat model for attacking RL systems and developed a method to make RL systems more robust.
About the Role
You will collaborate closely with Adam and/or Scott on their research projects. As a research engineer, you will develop scalable implementations of machine learning algorithms and use them to run scientific experiments. You will be involved in the write-up of results and credited as an author in submissions to peer-reviewed venues (e.g. NeurIPS, ICLR, JMLR).
While each of our projects is unique, your role will generally offer:
- Flexibility. You will focus on research engineering but contribute to all aspects of the research project. We expect everyone on the project to help shape the research direction, analyse experimental results, and participate in the write-up of results.
- Variety. You will work on a project that uses a range of technical approaches to solve a problem. You will also have the opportunity to contribute to different research agendas and projects over time.
- Collaboration. You will work regularly with our collaborators from a range of academic labs and research institutions.
- Mentorship. You will develop your research taste through regular project meetings and develop your programming style through code reviews.
- Autonomy. You will be highly self-directed. To succeed in the role, you will likely need to spend part of your time studying machine learning and developing your high-level views on AI safety research.
You will be a contractor for the Fund for Alignment Research, a project of the 501(c)(3) Players Philanthropy Fund. The Fund for Alignment Research is advised by Ethan Perez, Scott Emmons, and Adam Gleave.
We are looking to offer contracts with the following details:
- Location: Both remote and in-person (Berkeley, CA) are possible.
- Hours: Full-time preferred, part-time possible (minimum 20 hours/week).
- Contract length: 6-12 months with the possibility of renewal.
- Compensation: $50-$100/hour depending on experience and location. We will also pay for work-related travel and equipment expenses.
This role would be a good fit for someone looking to gain hands-on experience with machine learning engineering while testing their personal fit for AI safety research. We imagine interested applicants might be looking to grow an existing portfolio of machine learning research or looking to transition to AI safety research from a software engineering background.
It is essential that you:
- Have significant software engineering experience or experience applying machine learning methods. Evidence of this may include prior work experience, open-source contributions, or academic publications.
- Have experience with at least one object-oriented programming language (preferably Python).
- Are results-oriented and motivated by impactful research.
It is preferable that you have experience with some of the following:
- Common ML frameworks like PyTorch or TensorFlow.
- Natural language processing or reinforcement learning.
- Operating system internals and distributed systems.
- Publications or open-source software contributions.
- Basic linear algebra, calculus, probability, and statistics.
About the Projects
As a research engineer, you would collaborate on a project on one of the following topics:
- Reward and imitation learning. Developing a reliable set of baseline implementations for algorithms that can learn from human feedback. Extensions may include developing standardised benchmark environments, datasets, and evaluation procedures.
- Natural language processing. Integrating large language models with reinforcement learning in order to better understand human intentions. Creating text datasets and using them to fine-tune large language models.
- Adversarial robustness. Applying reinforcement learning techniques to test for vulnerabilities in narrowly superhuman systems such as KataGo.