Say What I Want: Towards the Dark Side of Neural Dialogue Models
Neural dialogue models have been widely adopted in chatbot applications because they perform well at simulating and generalizing human conversations. However, these models have a dark side: because neural networks are vulnerable to manipulation, users can steer a neural dialogue model into saying what they want, which raises concerns about the security of practical chatbot services. In this work, we investigate whether we can craft inputs that lead a well-trained black-box neural dialogue model to generate targeted outputs. We formulate this as a reinforcement learning (RL) problem and train a Reverse Dialogue Generator that efficiently finds such inputs for targeted outputs. Experiments conducted on a representative neural dialogue model show that our proposed model is able to discover such inputs in a considerable portion of cases. Overall, our work reveals this weakness of neural dialogue models and may prompt further research on solutions to mitigate it.
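The abstract does not give implementation details, so the following is only a toy sketch of the general idea of treating input search against a black-box model as an RL problem. It is not the authors' Reverse Dialogue Generator: the stand-in model `black_box_reply`, the overlap-based `reward`, the tiny `VOCAB`, and the per-position softmax policy trained with a REINFORCE-style update are all assumptions made for illustration.

```python
import math
import random

# Hypothetical toy vocabulary for crafted inputs (not from the paper).
VOCAB = ["hi", "bye", "yes", "no"]


def black_box_reply(tokens):
    """Stand-in for a dialogue model: a fixed token-wise mapping.

    The search below only queries this function; it never inspects it.
    """
    mapping = {"hi": "hello", "bye": "goodbye", "yes": "sure", "no": "nope"}
    return [mapping[t] for t in tokens]


def reward(output, target):
    """Fraction of positions where the model output matches the target."""
    return sum(o == t for o, t in zip(output, target)) / len(target)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_reverse_generator(target, steps=2000, lr=0.5, seed=0):
    """REINFORCE-style search for an input whose reply matches `target`."""
    rng = random.Random(seed)
    # One independent categorical policy per input position.
    logits = [[0.0] * len(VOCAB) for _ in target]
    baseline = 0.0  # running average of rewards, used to reduce variance
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        idx = [rng.choices(range(len(VOCAB)), weights=p)[0] for p in probs]
        r = reward(black_box_reply([VOCAB[i] for i in idx]), target)
        advantage = r - baseline
        baseline += 0.05 * (r - baseline)
        for pos, chosen in enumerate(idx):
            for k in range(len(VOCAB)):
                # Gradient of log softmax probability of the sampled token.
                grad = (1.0 if k == chosen else 0.0) - probs[pos][k]
                logits[pos][k] += lr * advantage * grad
    # Return the most probable token at each position.
    return [VOCAB[max(range(len(VOCAB)), key=row.__getitem__)] for row in logits]


if __name__ == "__main__":
    target = ["hello", "sure", "goodbye"]
    crafted = train_reverse_generator(target)
    print(crafted, reward(black_box_reply(crafted), target))
```

In this sketch the policy sees only the scalar reward returned after each query, which mirrors the black-box setting described in the abstract; a real attack would replace the lookup table with queries to an actual dialogue model and would need a far richer policy and reward.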
Authors

Haochen Liu
Tyler Derr
Zitao Liu
Jiliang Tang
Github
Stargazers: 69
Forks: 26
Open Issues: 0
Network: 86
Subscribers: 3
Language: None
chat corpus collection from various open sources
Other
Sample Sizes (N=): None
Inserted: 09/15/19 06:04PM
Words Total: 6,352
Words Unique: 2,127
Tweets
arxiv_cscl: Say What I Want: Towards the Dark Side of Neural Dialogue Models https://t.co/ZaILI4R21e
dse_msu: Our research "Say What I Want: Towards the Dark Side of Neural Dialogue Models" by @lhaochen, @tylersnetwork, @yourzi, and @tangjiliang is featured by @katyanna_q in @TheRegister. Preprint: https://t.co/4QWr7uhFZm @MSU_Egr_News @msuresearch https://t.co/rU6SU7uiY0 https://t.co/YCvySwL687
sairsyr: RT @arxiv_cscl: Say What I Want: Towards the Dark Side of Neural Dialogue Models https://t.co/ZaILI4R21e
arxiv_cscl: Say What I Want: Towards the Dark Side of Neural Dialogue Models https://t.co/ZaILI4zqCE
Rosenchild: RT @arxiv_cscl: Say What I Want: Towards the Dark Side of Neural Dialogue Models https://t.co/ZaILI4R21e
arxiv_cs_LG: Say What I Want: Towards the Dark Side of Neural Dialogue Models. Haochen Liu, Tyler Derr, Zitao Liu, and Jiliang Tang https://t.co/v2aylzpuca
memeticyoga: RT @SciFi: Say What I Want: Towards the Dark Side of Neural Dialogue Models. https://t.co/o7fKcpi5PL
SciFi: Say What I Want: Towards the Dark Side of Neural Dialogue Models. https://t.co/o7fKcpi5PL
arxivml: "Say What I Want: Towards the Dark Side of Neural Dialogue Models", Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang https://t.co/oEGgrLCcRa
BrundageBot: Say What I Want: Towards the Dark Side of Neural Dialogue Models. Haochen Liu, Tyler Derr, Zitao Liu, and Jiliang Tang https://t.co/YSpRnGMK9Q