OpenAI achieves Continuous Agent Adaptation via Meta-Learning

From Adaptation via Meta-Learning

We’ve evolved a population of 1050 agents of different anatomies (Ant, Bug, Spider), policies (MLP, LSTM), and adaptation strategies (PPO-tracking, RL^2, meta-updates) for 10 epochs. Initially, we had an equal number of agents of each type. Every epoch, we randomly matched 1000 pairs of agents and made them compete and adapt in multi-round games against each other. The agents that lost disappeared from the population, while the winners replicated themselves.

Summary: After a few epochs of evolution, the Spiders, being the weakest, disappeared; the subpopulation of Bugs more than doubled; and the number of Ants stayed the same. Importantly, the agents with meta-learned adaptation strategies ended up dominating the population.
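To make the selection loop described above concrete, here is a minimal sketch in Python. It is not OpenAI's code. It tracks anatomy only (350 agents each of Ant, Bug, and Spider, ignoring policy and adaptation strategy), replaces the actual RoboSumo games with a coin flip weighted by made-up `skill` numbers, and assumes that each match draws two agents at random from the current population, with the loser's slot taken over by a copy of the winner. All function and parameter names are hypothetical.

```python
import random
from collections import Counter


def evolve(types, skill, per_type=350, epochs=10, matches=1000, seed=0):
    """Simulate winner-replicates, loser-disappears selection.

    `skill` maps each type to a positive number. These values are
    illustrative assumptions, not measurements from the experiment.
    Returns one Counter of type frequencies per epoch, starting with
    the initial population.
    """
    rng = random.Random(seed)
    # Equal number of agents of each type at the start.
    population = [t for t in types for _ in range(per_type)]
    history = [Counter(population)]
    for _ in range(epochs):
        for _ in range(matches):
            # Assumption: pairs are drawn from the current population.
            i, j = rng.sample(range(len(population)), 2)
            a, b = population[i], population[j]
            # Stand-in for a multi-round game: weighted coin flip.
            p_a_wins = skill[a] / (skill[a] + skill[b])
            if rng.random() < p_a_wins:
                population[j] = a  # loser replaced by a copy of the winner
            else:
                population[i] = b
        history.append(Counter(population))
    return history


if __name__ == "__main__":
    # Hypothetical skill values chosen only for illustration.
    skill = {"Ant": 1.0, "Bug": 1.3, "Spider": 0.6}
    for epoch, counts in enumerate(evolve(list(skill), skill)):
        print(epoch, dict(counts))
```

Because every match removes one agent and adds one copy, the population size stays at 1050 throughout, and only the mix of types changes from epoch to epoch.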

OpenAI has developed a “learning to learn” (or meta-learning) framework that allows an AI agent to continuously adapt to a dynamic environment, at least in certain conditions. The environment is dynamic for a number of reasons, including the fact that opponents are learning as well.

AI agents equipped with the meta-learning framework win more fights against their opponents and eventually dominate the environment. Be sure to watch the last video to see the effect.

The meta-learning framework gives the selected AI agents the capability to anticipate changes in the environment and adapt faster than AI agents that only learn from direct experience.

The neocortex is often described as a prediction machine, and human intelligence depends heavily on the capability to anticipate and adapt. Seen in that light, this research looks like a key step towards artificial general intelligence.

Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments

From [1710.03641] Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments

Ability to continuously learn and adapt from limited experience in nonstationary environments is an important milestone on the path towards general intelligence. In this paper, we cast the problem of continuous adaptation into the learning-to-learn framework. We develop a simple gradient-based meta-learning algorithm suitable for adaptation in dynamically changing and adversarial scenarios. Additionally, we design a new multi-agent competitive environment, RoboSumo, and define iterated adaptation games for testing various aspects of continuous adaptation strategies. We demonstrate that meta-learning enables significantly more efficient adaptation than reactive baselines in the few-shot regime. Our experiments with a population of agents that learn and compete suggest that meta-learners are the fittest.
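The general idea behind gradient-based meta-learning for a changing task can be illustrated with a toy example. The sketch below is not the paper's algorithm, which works with policy gradients on RoboSumo trajectories. It is a MAML-style simplification under stated assumptions: each task is a one-dimensional quadratic loss with target `c`, the next task's target is always `c + drift`, and the only thing meta-learned is the starting parameter `theta`. The agent takes one gradient step on the current task and is scored on the next one, so meta-training pushes `theta` towards a starting point from which that single step lands close to where the task is heading.

```python
import random


def inner_step(theta, c, alpha):
    """One gradient step on the task loss L(theta) = 0.5 * (theta - c)**2."""
    return theta - alpha * (theta - c)


def next_task_loss(theta, c_now, drift=1.0, alpha=0.5):
    """Loss on the NEXT task after adapting to the current one."""
    adapted = inner_step(theta, c_now, alpha)
    return 0.5 * (adapted - (c_now + drift)) ** 2


def meta_train(drift=1.0, alpha=0.5, beta=0.1, steps=2000, seed=0):
    """Meta-learn an initial theta by SGD on the post-adaptation loss."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(steps):
        c_now = rng.uniform(-1.0, 1.0)   # current task (toy assumption)
        c_next = c_now + drift           # the task drifts predictably
        adapted = inner_step(theta, c_now, alpha)
        # Chain rule through the inner step: d(adapted)/d(theta) = 1 - alpha.
        meta_grad = (adapted - c_next) * (1.0 - alpha)
        theta -= beta * meta_grad
    return theta


if __name__ == "__main__":
    theta = meta_train()
    print("meta-learned start:", theta)
    print("loss from meta-learned start:", next_task_loss(theta, 0.0))
    print("loss from naive start 0.0:  ", next_task_loss(0.0, 0.0))
```

In this toy setting the meta-learned starting point overshoots the current target on purpose, which is a simple picture of anticipating a change rather than reacting to it after the fact.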

And yet, AI is easier to trick than people think

From Robust Adversarial Examples

We’ve created images that reliably fool neural network classifiers when viewed from varied scales and perspectives. This challenges a claim from last week that self-driving cars would be hard to trick maliciously since they capture images from multiple scales, angles, perspectives, and the like.

This innocuous kitten photo, printed on a standard color printer, fools the classifier into thinking it’s a monitor or desktop computer regardless of how it’s zoomed or rotated. We expect further parameter tuning would also remove any human-visible artifacts.

Watch the videos.