Top 10 arXiv Papers Today in Computation and Language


0.0 Mikeys
#1. Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference
Timo Schick, Hinrich Schütze
Some NLP tasks can be solved in a fully unsupervised fashion by providing a pretrained language model with "task descriptions" in natural language (e.g., Radford et al., 2019). While this approach underperforms its supervised counterpart, we show in this work that the two ideas can be combined: We introduce Pattern-Exploiting Training (PET), a semi-supervised training procedure that reformulates input examples as cloze-style phrases which help the language model understand the given task. These phrases are then used to assign soft labels to a large set of unlabeled examples. Finally, regular supervised training is performed on the resulting training set. On several tasks, we show that PET outperforms both supervised training and unsupervised approaches in low-resource settings by a large margin.
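The cloze reformulation and soft-labeling steps described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the pattern wording, the verbalizer words, the `[MASK]` placeholder, and the hand-supplied scores that stand in for a real language model's output.

```python
import math

MASK = "[MASK]"  # assumed placeholder; the real mask token depends on the language model

def pattern(text):
    """Hypothetical pattern: wrap an input example in a cloze-style phrase."""
    return f"{text} All in all, it was {MASK}."

# Hypothetical verbalizer: one fill-in word per label.
VERBALIZER = {"positive": "great", "negative": "terrible"}

def soft_labels(token_scores):
    """Turn per-word scores at the mask position into a soft label distribution.

    token_scores maps candidate words to raw scores; in practice these would
    come from a masked language model, here they are supplied by hand.
    """
    logits = {label: token_scores[word] for label, word in VERBALIZER.items()}
    peak = max(logits.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}
```

For example, `pattern("Best pizza ever!")` yields the phrase a masked language model would be asked to complete, and `soft_labels({"great": 2.0, "terrible": 0.0})` gives a distribution over the two labels that could be attached to an unlabeled example.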
Tweets
timo_schick: new paper: we show how pretrained language models can learn downstream tasks from just a handful of examples: https://t.co/Ltk8QhAiN8 - feedback appreciated :)
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 0
Unique Words: 0

0.0 Mikeys
#2. Generating Sense Embeddings for Syntactic and Semantic Analogy for Portuguese
Jessica Rodrigues da Silva, Helena de Medeiros Caseli
Word embeddings are numerical vectors which can represent words or concepts in a low-dimensional continuous space. These vectors are able to capture useful syntactic and semantic information. The traditional approaches like Word2Vec, GloVe and FastText have a strict drawback: they produce a single vector representation per word ignoring the fact that ambiguous words can assume different meanings. In this paper we use techniques to generate sense embeddings and present the first experiments carried out for Portuguese. Our experiments show that sense vectors outperform traditional word vectors in syntactic and semantic analogy tasks, proving that the language resource generated here can improve the performance of NLP tasks in Portuguese.
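The analogy tasks mentioned in this abstract are commonly scored with vector arithmetic and cosine similarity. The sketch below shows that general idea only; the toy three-dimensional vectors, the English vocabulary, and the function names are assumptions for illustration and are not the paper's data or code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def solve_analogy(vectors, a, b, c):
    """Return the word d such that a is to b as c is to d."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words are excluded from the candidates, as is usual.
    candidates = (word for word in vectors if word not in (a, b, c))
    return max(candidates, key=lambda word: cosine(vectors[word], target))

# Hand-made toy vectors (assumed): axes roughly mean "male", "female", "royal".
TOY_VECTORS = {
    "man": [1.0, 0.0, 0.0],
    "woman": [0.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [0.0, 1.0, 1.0],
    "apple": [0.5, 0.5, 0.0],
}
```

With sense embeddings, an ambiguous word would have one entry per sense in such a table rather than a single shared vector.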
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 0
Unique Words: 0

0.0 Mikeys
#3. Domain-Aware Dialogue State Tracker for Multi-Domain Dialogue Systems
Vevake Balaraman, Bernardo Magnini
In task-oriented dialogue systems the dialogue state tracker (DST) component is responsible for predicting the state of the dialogue based on the dialogue history. Current DST approaches rely on a predefined domain ontology, a fact that limits their effective usage for large scale conversational agents, where the DST constantly needs to be interfaced with ever-increasing services and APIs. To overcome this drawback, we propose a domain-aware dialogue state tracker that is completely data-driven and is modeled to predict for dynamic service schemas. The proposed model utilizes domain and slot information to extract both domain and slot specific representations for a given dialogue, and then uses such representations to predict the values of the corresponding slot. Integrating this mechanism with a pretrained language model (i.e., BERT), our approach can effectively learn semantic relations.
Tweets
arxivml: "Domain-Aware Dialogue State Tracker for Multi-Domain Dialogue Systems", Vevake Balaraman, Bernardo Magnini https://t.co/EiLb7acQTH
SciFi: Domain-Aware Dialogue State Tracker for Multi-Domain Dialogue Systems. https://t.co/wGJ7BDWUo2
arxiv_cscl: Domain-Aware Dialogue State Tracker for Multi-Domain Dialogue Systems https://t.co/bubB5SELgS
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 0
Unique Words: 0

0.0 Mikeys
#4. A Physical Embedding Model for Knowledge Graphs
Caglar Demir, Axel-Cyrille Ngonga Ngomo
Knowledge graph embedding methods learn continuous vector representations for entities in knowledge graphs and have been used successfully in a large number of applications. We present a novel and scalable paradigm for the computation of knowledge graph embeddings, which we dub PYKE. Our approach combines a physical model based on Hooke's law and its inverse with ideas from simulated annealing to compute embeddings for knowledge graphs efficiently. We prove that PYKE achieves a linear space complexity. While the time complexity for the initialization of our approach is quadratic, the time complexity of each of its iterations is linear in the size of the input knowledge graph. Hence, PYKE's overall runtime is close to linear. Consequently, our approach easily scales up to knowledge graphs containing millions of triples. We evaluate our approach against six state-of-the-art embedding approaches on the DrugBank and DBpedia datasets in two series of experiments. The first series shows that the cluster purity achieved by PYKE is up to...
Tweets
BrundageBot: A Physical Embedding Model for Knowledge Graphs. Caglar Demir and Axel-Cyrille Ngonga Ngomo https://t.co/n4Cs7mA7EH
arxiv_cscl: A Physical Embedding Model for Knowledge Graphs https://t.co/Zh1jl81gMO
gaialive: RT @arxiv_cscl: A Physical Embedding Model for Knowledge Graphs https://t.co/Zh1jl81gMO
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 0
Unique Words: 0

0.0 Mikeys
#5. Length-controllable Abstractive Summarization by Guiding with Summary Prototype
Itsumi Saito, Kyosuke Nishida, Kosuke Nishida, Atsushi Otsuka, Hisako Asano, Junji Tomita, Hiroyuki Shindo, Yuji Matsumoto
We propose a new length-controllable abstractive summarization model. Recent state-of-the-art abstractive summarization models based on encoder-decoder models generate only one summary per source text. However, controllable summarization, especially of the length, is an important aspect for practical applications. Previous studies on length-controllable abstractive summarization incorporate length embeddings in the decoder module for controlling the summary length. Although the length embeddings can control where to stop decoding, they do not decide which information should be included in the summary within the length constraint. Unlike the previous models, our length-controllable abstractive summarization model incorporates a word-level extractive module in the encoder-decoder model instead of length embeddings. Our model generates a summary in two steps. First, our word-level extractor extracts a sequence of important words (we call it the "prototype text") from the source text according to the word-level importance scores and...
Tweets
arxiv_cscl: Length-controllable Abstractive Summarization by Guiding with Summary Prototype https://t.co/elZIW1cB97
Other stats
Sample Sizes: None.
Authors: 8
Total Words: 0
Unique Words: 0

0.0 Mikeys
#6. Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching
Shuohang Wang, Yunshi Lan, Yi Tay, Jing Jiang, Jingjing Liu
Transformer has been successfully applied to many natural language processing tasks. However, for textual sequence matching, simple matching between the representation of a pair of sequences might bring in unnecessary noise. In this paper, we propose a new approach to sequence pair matching with Transformer, by learning head-wise matching representations on multiple levels. Experiments show that our proposed approach can achieve new state-of-the-art performance on multiple tasks that rely only on pre-computed sequence-vector-representation, such as SNLI, MNLI-match, MNLI-mismatch, QQP, and SQuAD-binary.
Tweets
arxiv_cscl: Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching https://t.co/ZqukNpx9uN
Other stats
Sample Sizes: None.
Authors: 5
Total Words: 0
Unique Words: 0

0.0 Mikeys
#7. Text-based inference of moral sentiment change
Jing Yi Xie, Renato Ferreira Pinto Jr., Graeme Hirst, Yang Xu
We present a text-based framework for investigating moral sentiment change of the public via longitudinal corpora. Our framework is based on the premise that language use can inform people's moral perception toward right or wrong, and we build our methodology by exploring moral biases learned from diachronic word embeddings. We demonstrate how a parameter-free model supports inference of historical shifts in moral sentiment toward concepts such as slavery and democracy over centuries at three incremental levels: moral relevance, moral polarity, and fine-grained moral dimensions. We apply this methodology to visualizing moral time courses of individual concepts and analyzing the relations between psycholinguistic variables and rates of moral sentiment change at scale. Our work offers opportunities for applying natural language processing toward characterizing moral sentiment change in society.
Tweets
arxiv_cscl: Text-based inference of moral sentiment change https://t.co/eag9LsQ5OQ
arxiv_cscl: Text-based inference of moral sentiment change https://t.co/eag9Lt7Hdq
Other stats
Sample Sizes: None.
Authors: 4
Total Words: 0
Unique Words: 0

0.0 Mikeys
#8. Nested-Wasserstein Self-Imitation Learning for Sequence Generation
Ruiyi Zhang, Changyou Chen, Zhe Gan, Zheng Wen, Wenlin Wang, Lawrence Carin
Reinforcement learning (RL) has been widely studied for improving sequence-generation models. However, the conventional rewards used for RL training typically cannot capture sufficient semantic information and therefore render model bias. Further, the sparse and delayed rewards make RL exploration inefficient. To alleviate these issues, we propose the concept of nested-Wasserstein distance for distributional semantic matching. To further exploit it, a novel nested-Wasserstein self-imitation learning framework is developed, encouraging the model to exploit historical high-rewarded sequences for enhanced exploration and better semantic matching. Our solution can be understood as approximately executing proximal policy optimization with Wasserstein trust-regions. Experiments on a variety of unconditional and conditional sequence-generation tasks demonstrate the proposed approach consistently leads to improved performance.
Tweets
arxivml: "Nested-Wasserstein Self-Imitation Learning for Sequence Generation", Ruiyi Zhang, Changyou Chen, Zhe Gan, Zheng We… https://t.co/K5uTTUgb7v
Memoirs: Nested-Wasserstein Self-Imitation Learning for Sequence Generation. https://t.co/Fufgn41ht0
arxiv_cscl: Nested-Wasserstein Self-Imitation Learning for Sequence Generation https://t.co/jLd3IwZcPY
Other stats
Sample Sizes: None.
Authors: 6
Total Words: 0
Unique Words: 0

0.0 Mikeys
#9. Audio Summarization with Audio Features and Probability Distribution Divergence
Carlos-Emiliano González-Gallardo, Romain Deveaud, Eric SanJuan, Juan-Manuel Torres
The automatic summarization of multimedia sources is an important task that facilitates the understanding of an individual by condensing the source while maintaining relevant information. In this paper we focus on audio summarization based on audio features and the probability of distribution divergence. Our method, based on an extractive summarization approach, aims to select the most relevant segments until a time threshold is reached. It takes into account the segment's length, position and informativeness value. Informativeness of each segment is obtained by mapping a set of audio features issued from its Mel-frequency Cepstral Coefficients and their corresponding Jensen-Shannon divergence score. Results over a multi-evaluator scheme show that our approach provides understandable and informative summaries.
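For readers unfamiliar with the Jensen-Shannon divergence named in this abstract, here is a minimal sketch of the standard definition for two discrete probability distributions, using base-2 logarithms so the value lies between 0 and 1. The function names are illustrative, and how the paper maps audio features to this score is not shown here.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL divergence of p and q to their midpoint."""
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, midpoint) + 0.5 * kl_divergence(q, midpoint)
```

Unlike the Kullback-Leibler divergence, this measure is symmetric and stays finite even when one distribution assigns zero probability where the other does not.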
Tweets
arxiv_cscl: Audio Summarization with Audio Features and Probability Distribution Divergence https://t.co/8PkPvtCc3Y
Other stats
Sample Sizes: None.
Authors: 4
Total Words: 0
Unique Words: 0

0.0 Mikeys
#10. Improving Interaction Quality Estimation with BiLSTMs and the Impact on Dialogue Policy Learning
Stefan Ultes
Learning suitable and well-performing dialogue behaviour in statistical spoken dialogue systems has been a focus of research for many years. While most work based on reinforcement learning employs an objective measure like task success for modelling the reward signal, we use a reward based on user satisfaction estimation. We propose a novel estimator and show that it outperforms all previous estimators while learning temporal dependencies implicitly. Furthermore, we apply this novel user satisfaction estimation model live in simulated experiments where the satisfaction estimation model is trained on one domain and applied in many other domains which cover a similar task. We show that applying this model results in higher estimated satisfaction, similar task success rates and a higher robustness to noise.
Tweets
arxivml: "Improving Interaction Quality Estimation with BiLSTMs and the Impact on Dialogue Policy Learning", Stefan Ultes https://t.co/yih0zwfWHY
StatsPapers: Improving Interaction Quality Estimation with BiLSTMs and the Impact on Dialogue Policy Learning. https://t.co/LLi2lqXMUn
Github

Attention mechanism for processing sequential data that considers the context for each timestamp.

Repository: keras-self-attention
User: CyberZHG
Language: Python
Stargazers: 305
Subscribers: 9
Forks: 84
Open Issues: 0
Other stats
Sample Sizes: None.
Authors: 1
Total Words: 6181
Unique Words: 2029

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real time) based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 257,111 papers.
