The full paper is available.
Abstract:
Developing a general algorithm that learns to solve tasks across a wide range of applications has been a fundamental challenge in artificial intelligence. Although current reinforcement-learning algorithms can be readily applied to tasks similar to what they have been developed for, configuring them for new application domains requires substantial human expertise and experimentation [1,2]. Here we present the third generation of Dreamer, a general algorithm that outperforms specialized methods across over 150 diverse tasks, with a single configuration. Dreamer learns a model of the environment and improves its behaviour by imagining future scenarios. Robustness techniques based on normalization, balancing and transformations enable stable learning across domains. Applied out of the box, Dreamer is, to our knowledge, the first algorithm to collect diamonds in Minecraft from scratch without human data or curricula. This achievement has been posed as a substantial challenge in artificial intelligence that requires exploring farsighted strategies from pixels and sparse rewards in an open world [3]. Our work allows solving challenging control problems without extensive experimentation, making reinforcement learning broadly applicable.
At best, making oncall the exclusive responsibility of an elite SRE class increases our tolerance for complexity.
See Simplicity.
Oncall is a form of toil – it needs to be done but it doesn’t leave our systems in a better state.
Stakeholders see high-profile incident response/oncall happening, and don’t demand clarity on what other work the group is undertaking.
To go further – incident command and management is a specific set of skills that you can definitely be good at, and where the business really, really needs a consistent and competent response, every time. At Twilio, we have a specific team that manages all incidents, follow-up actions, and operational insights around incidents company-wide. We’ve found that making sure that the data and insights around incidents and their follow-up flow back into the business is a full-time job. Relying on a rotation of variably interested volunteers to ensure this happens will get you mixed results.
It will be useful to have Chapter 11 (Being On-Call) from Google's SRE book available (it's one in a series).