HAI Weekly Seminar with Vael Gates

Researcher Perceptions of Current and Future AI

Artificial intelligence (AI) has enormous potential for both positive and negative impact, especially as we move from current-day systems towards more capable systems in the future. However, as a society we lack an understanding of how the developers of this technology, AI researchers, perceive the benefits and risks of their work, both in today's systems and in those of the future. In this talk, Gates will present results from over 70 interviews with AI researchers, asking questions ranging from "What do you think are the largest benefits and risks of AI?" to "If you could change your colleagues' perception of AI, what attitudes/beliefs would you want them to have?"

READINGS:

“The case for taking AI seriously as a threat to humanity” by Kelsey Piper (Vox)
Human Compatible, by Stuart Russell
The Alignment Problem, by Brian Christian
The Precipice: Existential Risk and the Future of Humanity, by Toby Ord
The Most Important Century, specifically "Forecasting Transformative AI", by Holden Karnofsky

TECHNICAL READINGS:

Empirical work by DeepMind's Safety team on alignment
Empirical work by Anthropic on alignment
Talk (and transcript) by Paul Christiano describing the AI alignment landscape in 2020
Podcast (and transcript) by Rohin Shah, describing the state of AI value alignment in 2021
Alignment Newsletter and ML Safety Newsletter
Unsolved Problems in ML Safety by Hendrycks et al. (2022)
Alignment Research Center
Interpretability work aimed at alignment: Elhage et al. (2021) and Olah et al. (2020)
AI Safety Resources by Victoria Krakovna (DeepMind) and Technical Alignment Curriculum

FUNDING:

Open Philanthropy Graduate Student Fellowship
Open Philanthropy Faculty Fellowship (faculty and others can reach out to OpenPhil directly as well)
FTX Future Fund
Long-Term Future Fund

STANFORD RESOURCES:

Contact Vael Gates at vlgates@stanford.edu for further questions or collaboration inquiries.

Speaker

Vael Gates

HAI Network Affiliate

Vael received their Ph.D. in Neuroscience (Computational Cognitive Science) from UC Berkeley in 2021. During their Ph.D. they worked on formalizing and testing computational cognitive models of social collaboration. Their Ph.D.
