HAI Weekly Seminar with Vael Gates
Event Details
Location: Virtual
Researcher Perceptions of Current and Future AI
Artificial intelligence (AI) has enormous potential for both positive and negative impact, especially as we move from current-day systems towards more capable systems in the future. However, as a society we lack an understanding of how the developers of this technology, AI researchers, perceive the benefits and risks of their work, both in today's systems and in future ones. In this talk, Gates will present results from over 70 interviews in which AI researchers were asked questions ranging from "What do you think are the largest benefits and risks of AI?" to "If you could change your colleagues’ perception of AI, what attitudes/beliefs would you want them to have?"
READINGS:
- “The case for taking AI seriously as a threat to humanity” by Kelsey Piper (Vox)
- Human Compatible, by Stuart Russell
- The Alignment Problem, by Brian Christian
- The Precipice: Existential Risk and the Future of Humanity, by Toby Ord
- The Most Important Century, specifically "Forecasting Transformative AI", by Holden Karnofsky
TECHNICAL READINGS:
- Empirical work by DeepMind's Safety team on alignment
- Empirical work by Anthropic on alignment
- Talk (and transcript) by Paul Christiano describing the AI alignment landscape in 2020
- Podcast (and transcript) by Rohin Shah, describing the state of AI value alignment in 2021
- Alignment Newsletter and ML Safety Newsletter
- Unsolved Problems in ML Safety by Hendrycks et al. (2022)
- Alignment Research Center
- Interpretability work aimed at alignment: Elhage et al. (2021) and Olah et al. (2020)
- AI Safety Resources by Victoria Krakovna (DeepMind) and Technical Alignment Curriculum
FUNDING:
- Open Philanthropy Graduate Student Fellowship
- Open Philanthropy Faculty Fellowship (faculty and others can reach out to OpenPhil directly as well)
- FTX Future Fund
- Long-Term Future Fund
STANFORD RESOURCES:
Contact Vael Gates at vlgates@stanford.edu for further questions or collaboration inquiries.
Speaker
Vael Gates
HAI Network Affiliate
Vael received their Ph.D. in Neuroscience (Computational Cognitive Science) from UC Berkeley in 2021. During their Ph.D. they worked on formalizing and testing computational cognitive models of social collaboration. Their Ph.D.