Published on January 11, 2019.

The paper “Scalable agent alignment via reward modeling: a research direction” is available here:
1. https://arxiv.org/abs/1811.07871
2. https://medium.com/@deepmindsafetyresearch/scalable-agent-alignment-via-reward-modeling-bf4ab06dfd84

Pick up cool perks on our Patreon page:
› https://www.patreon.com/TwoMinutePapers

We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
313V, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Dennis Abts, Emmanuel, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, Javier Bustamante, John De Witt, Kaiesh Vohra, Kjartan Olason, Lorin Atzberger, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Morten Punnerud Engelstad, Nader Shakerin, Owen Skarpness, Raul Araújo da Silva, Richard Reis, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Thomas Krcmar, Torsten Reil, Zach Boldyga, Zach Doty.

Crypto and PayPal links are available below. Thank you very much for your generous support!
› PayPal: https://www.paypal.me/TwoMinutePapers
› Bitcoin: 1a5ttKiVQiDcr9j8JT2DoHGzLG7XTJccX
› Ethereum: 0xbBD767C0e14be1886c6610bf3F592A91D866d380
› LTC: LM8AUh5bGcNgzq6HaV1jeaJrFvmKxxgiXg

Thumbnail background image credit: https://pixabay.com/photo-3706562/
Splash screen/thumbnail design: Felícia Fehér – http://felicia.hu

Károly Zsolnai-Fehér’s links:
Facebook: https://www.facebook.com/TwoMinutePapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/
