Human-level performance in first-person multiplayer games with population-based deep reinforcement learning
Recent progress in artificial intelligence through reinforcement learning (RL) has shown great success on increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents, and environments reflecting this degree of complexity remain an open challenge. In this work, we demonstrate for the first time that an agent can achieve human-level performance in a popular 3D multiplayer first-person video game, Quake III Arena Capture the Flag, using only pixels and game points as input. These results were achieved by a novel two-tier optimisation process in which a population of independent RL agents are trained concurrently from thousands of parallel matches, with agents playing in teams together and against each other on randomly generated environments. Each agent in the population learns its own internal reward signal to complement the sparse delayed reward from winning, and selects actions using a novel temporally hierarchical representation that enables it to reason at multiple timescales. During game-play, these agents display human-like behaviours such as navigating, following, and defending, based on a rich learned representation that is shown to encode high-level game knowledge. In an extensive tournament-style evaluation, the trained agents exceeded the win-rate of strong human players both as teammates and opponents, and proved far stronger than existing state-of-the-art agents. These results demonstrate a significant jump in the capabilities of artificial agents, bringing us closer to the goal of human-level intelligence.
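The two-tier optimisation described above can be sketched as an inner RL loop nested inside an outer population-based loop. The toy code below is only an illustrative sketch, not the paper's implementation: the fitness function, population size, perturbation scale, and the `inner_rl_step` stand-in are all assumptions made for a self-contained example of the exploit-and-explore pattern.

```python
import random

POP_SIZE = 8       # illustrative, not the paper's population size
GENERATIONS = 20

def inner_rl_step(agent):
    """Inner tier (stand-in): RL on the agent's own internal reward.
    Here fitness simply improves as the agent's internal-reward weights
    approach a hidden optimum, standing in for match outcomes."""
    target = [0.5, -0.2, 0.8]
    err = sum((w - t) ** 2 for w, t in zip(agent["reward_weights"], target))
    agent["fitness"] = -err

def exploit_and_explore(population):
    """Outer tier: the weakest agents copy a strong agent's
    internal-reward weights, then perturb them (a simplified
    exploit/explore step in the style of population-based training)."""
    ranked = sorted(population, key=lambda a: a["fitness"], reverse=True)
    top = ranked[: POP_SIZE // 4]
    bottom = ranked[-(POP_SIZE // 4):]
    for loser in bottom:
        winner = random.choice(top)
        loser["reward_weights"] = [
            w * random.uniform(0.8, 1.2) for w in winner["reward_weights"]
        ]

random.seed(0)
population = [
    {"reward_weights": [random.uniform(-1, 1) for _ in range(3)],
     "fitness": 0.0}
    for _ in range(POP_SIZE)
]
for _ in range(GENERATIONS):
    for agent in population:
        inner_rl_step(agent)        # each agent trains on its own reward
    exploit_and_explore(population)  # population evolves the rewards

best = max(a["fitness"] for a in population)
print(f"best fitness after PBT sketch: {best:.3f}")
```

The key design point this sketch mirrors is that the internal reward weights are not hand-tuned: they are selected by the outer loop purely on the basis of match-level success, while the inner loop only ever optimises the agent's own learned reward.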
Authors

Max Jaderberg
Wojciech M. Czarnecki
Iain Dunning
Luke Marris
Guy Lever
Antonio Garcia Castaneda
Charles Beattie
Neil C. Rabinowitz
Ari S. Morcos
Avraham Ruderman
Nicolas Sonnerat
Tim Green
Louise Deason
Joel Z. Leibo
David Silver
Demis Hassabis
Koray Kavukcuoglu
Thore Graepel