Deep Reinforcement Learning with Feedback-based Exploration
Deep Reinforcement Learning has enabled the control of increasingly complex and high-dimensional problems. However, the need for vast amounts of data before reasonable performance is attained prevents its widespread application. We employ binary corrective feedback as a general and intuitive manner to incorporate human intuition and domain knowledge in model-free machine learning. The uncertainty in the policy and the corrective feedback are combined directly in the action space as probabilistic conditional exploration. As a result, the greatest part of the otherwise ignorant learning process can be avoided. We demonstrate the proposed method, Predictive Probabilistic Merging of Policies (PPMP), in combination with DDPG. In experiments on continuous control problems of the OpenAI Gym, we achieve drastic improvements in sample efficiency, final performance, and robustness to erroneous feedback, both for human and synthetic feedback. Additionally, we show that the obtained solutions surpass the demonstrated knowledge.
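The abstract describes combining the policy's uncertainty with binary corrective feedback directly in the action space, so that corrections steer exploration most where the agent is least certain. A minimal sketch of that idea (the function name, the linear merging rule, and the `gain` parameter are illustrative assumptions, not the paper's exact PPMP update):

```python
import numpy as np

def merge_action(policy_action, policy_std, feedback, gain=0.5):
    """Hypothetical sketch of uncertainty-weighted corrective merging.

    policy_action: action proposed by the learned policy (e.g. DDPG actor)
    policy_std:    the policy's uncertainty estimate for that action
    feedback:      binary human correction in {-1, 0, +1}
    gain:          illustrative scaling constant (assumption)

    The correction shifts the executed action in the signaled direction,
    scaled by the policy's own uncertainty: confident policies are barely
    perturbed, uncertain ones follow the feedback strongly.
    """
    correction = gain * policy_std * feedback
    # Keep the merged action inside a normalized action range [-1, 1].
    return float(np.clip(policy_action + correction, -1.0, 1.0))
```

For example, with `policy_std = 0.2` and positive feedback, a neutral action `0.0` is nudged to `0.1`; with `policy_std = 0.0` the feedback has no effect, mirroring the conditional-exploration intuition.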

Authors


Jan Scholten
Daan Wout
Carlos Celemin
Jens Kober

Github
Repo:
Stargazers: 3
Forks: 0
Open Issues: 0
Network: 0
Subscribers: 2
Language: Python
The Predictive Probabilistic Merging of Policies algorithm, implemented in combination with DDPG using Python.
Other
Sample Sizes (N=): None
Inserted: 03/14/19 06:00PM
Words Total: 5,746
Words Unique: 2,059
Source:
Abstract:
Tweets
arxiv_pop: Posted 2019/03/14, ranked #4 in LG (Machine Learning): Deep Reinforcement Learning with Feedback-based Exploration https://t.co/PFSP68Gh6d 8 Tweets 17 Retweets 39 Favorites
marcusborba: Deep Reinforcement Learning with Feedback-based Exploration https://t.co/H0FP2ouJSt @arxiv_org #ArtificialIntelligence #DeepLearning #DataScience #MachineLearning #AI https://t.co/6NFH1jgvjd
arxiv_org: Deep Reinforcement Learning with Feedback-based Exploration. https://t.co/JfxvqPMlcF https://t.co/CJjnYkstPn
arxivml: "Deep Reinforcement Learning with Feedback-based Exploration", Jan Scholten, Daan Wout, Carlos Celemin, Jens Kober https://t.co/CaYNfIOhan
arxiv_cs_LG: Deep Reinforcement Learning with Feedback-based Exploration. Jan Scholten, Daan Wout, Carlos Celemin, and Jens Kober https://t.co/BaGzCssQjX
SciFi: Deep Reinforcement Learning with Feedback-based Exploration. https://t.co/hzd6cX2psE
BrundageBot: Deep Reinforcement Learning with Feedback-based Exploration. Jan Scholten, Daan Wout, Carlos Celemin, and Jens Kober https://t.co/VIPQsMsVs3