### Top 10 arXiv Papers Today in Machine Learning

##### #1. OmniNet: A unified architecture for multi-modal multi-task learning
###### Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
The Transformer is a widely used neural network architecture, especially for language understanding. We introduce an extended and unified architecture which can be used for tasks involving a variety of modalities like image, text, videos, etc. We propose a spatio-temporal cache mechanism that enables learning the spatial dimension of the input in addition to the hidden states corresponding to the temporal input sequence. The proposed architecture further enables a single model to support tasks with multiple input modalities as well as asynchronous multi-task learning; thus we refer to it as OmniNet. For example, a single instance of OmniNet can concurrently learn to perform the tasks of part-of-speech tagging, image captioning, visual question answering and video activity recognition. We demonstrate that training these four tasks together results in a model about three times smaller while retaining the performance in comparison to training them individually. We also show that using this neural network pre-trained on some modalities...
###### Tweets
BrundageBot: OmniNet: A unified architecture for multi-modal multi-task learning. Subhojeet Pramanik, Priyanka Agrawal, and Aman Hussain https://t.co/ifFUyz1vDI
shunk031: OmniNet: A unified architecture for multi-modal multi-task learning. (arXiv:1907.07804v1 [cs.LG]) https://t.co/YssxeI0lAS
arxivml: "OmniNet: A unified architecture for multi-modal multi-task learning", Subhojeet Pramanik, Priyanka Agrawal, Aman H… https://t.co/IzswUnNFFM
subho406: OmniNet is the first-ever truly universal architecture for multi-modal multi-task learning. https://t.co/VPsnqwClTj #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks @JeffDean @hardmaru @iamtrask @IBMResearch @Google @Microsoft https://t.co/Sv1iC42Uvg
StatsPapers: OmniNet: A unified architecture for multi-modal multi-task learning. https://t.co/zcqnauKYkp
arxiv_cscv: OmniNet: A unified architecture for multi-modal multi-task learning https://t.co/8DeYZMwZVP
arxiv_cscl: OmniNet: A unified architecture for multi-modal multi-task learning https://t.co/5imNbatxK9
hrsma2i: RT @shunk031: OmniNet: A unified architecture for multi-modal multi-task learning. (arXiv:1907.07804v1 [cs.LG]) https://t.co/YssxeI0lAS
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unique Words: 0

##### #2. MintNet: Building Invertible Neural Networks with Masked Convolutions
###### Yang Song, Chenlin Meng, Stefano Ermon
We propose a new way of constructing invertible neural networks by combining simple building blocks with a novel set of composition rules. This leads to a rich set of invertible architectures, including those similar to ResNets. Inversion is achieved with a locally convergent iterative procedure that is parallelizable and very fast in practice. Additionally, the determinant of the Jacobian can be computed analytically and efficiently, enabling their generative use as flow models. To demonstrate their flexibility, we show that our invertible neural networks are competitive with ResNets on MNIST and CIFAR-10 classification. When trained as generative models, our invertible networks achieve new state-of-the-art likelihoods on MNIST, CIFAR-10 and ImageNet 32x32, with bits per dimension of 0.98, 3.32 and 4.06 respectively.
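For readers unfamiliar with the metric quoted above, bits per dimension is the negative log-likelihood expressed in bits and divided by the number of data dimensions. The snippet below is only a sketch of that conversion, not code from the paper. The function names are hypothetical, and the image shapes (28x28x1 for MNIST, 32x32x3 for CIFAR-10) are assumptions about the standard dataset formats.

```python
import math


def bits_per_dim(nll_nats: float, num_dims: int) -> float:
    """Convert a per-example negative log-likelihood (nats) to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))


def nll_nats_from_bpd(bpd: float, num_dims: int) -> float:
    """Inverse conversion: bits per dimension back to nats per example."""
    return bpd * num_dims * math.log(2)


# Assumed input sizes (height * width * channels); not stated in the abstract.
MNIST_DIMS = 28 * 28 * 1
CIFAR10_DIMS = 32 * 32 * 3

# Per-image negative log-likelihoods implied by the reported figures.
mnist_nll = nll_nats_from_bpd(0.98, MNIST_DIMS)
cifar_nll = nll_nats_from_bpd(3.32, CIFAR10_DIMS)

print(f"MNIST: about {mnist_nll:.1f} nats per image")
print(f"CIFAR-10: about {cifar_nll:.1f} nats per image")
```

Because the division normalizes by input size, the metric lets likelihoods be compared across datasets with different resolutions.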
###### Tweets
BrundageBot: MintNet: Building Invertible Neural Networks with Masked Convolutions. Yang Song, Chenlin Meng, and Stefano Ermon https://t.co/alTMuqf8i6
arxivml: "MintNet: Building Invertible Neural Networks with Masked Convolutions", Yang Song, Chenlin Meng, Stefano Ermon https://t.co/9amPEfb0g9
YSongStanford: Joint work with Chenlin Meng and @ermonste. Paper link: https://t.co/TFZIUMwHfg
arxiv_cs_LG: MintNet: Building Invertible Neural Networks with Masked Convolutions. Yang Song, Chenlin Meng, and Stefano Ermon https://t.co/2cNrhyijsc
StatsPapers: MintNet: Building Invertible Neural Networks with Masked Convolutions. https://t.co/hn3pxQzLfm
arxiv_cscv: MintNet: Building Invertible Neural Networks with Masked Convolutions https://t.co/Hq1xfNMz5D
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 8701
Unique Words: 2108

##### #3. Deep Multi-View Learning via Task-Optimal CCA
###### Heather D. Couture, Roland Kwitt, J. S. Marron, Melissa Troester, Charles M. Perou, Marc Niethammer
Canonical Correlation Analysis (CCA) is widely used for multimodal data analysis and, more recently, for discriminative tasks such as multi-view learning; however, it makes no use of class labels. Recent CCA methods have started to address this weakness but are limited in that they do not simultaneously optimize the CCA projection for discrimination and the CCA projection itself, or they are linear only. We address these deficiencies by simultaneously optimizing a CCA-based and a task objective in an end-to-end manner. Together, these two objectives learn a non-linear CCA projection to a shared latent space that is highly correlated and discriminative. Our method shows a significant improvement over previous state-of-the-art (including deep supervised approaches) for cross-view classification, regularization with a second view, and semi-supervised learning on real data.
###### Tweets
arxiv_org: Deep Multi-View Learning via Task-Optimal CCA. https://t.co/VFzUIb5d2m https://t.co/w3WNXuX3Z9
arxivml: "Deep Multi-View Learning via Task-Optimal CCA", Heather D. Couture, Roland Kwitt, J.S. Marron, Melissa Troester, C… https://t.co/aOTkV1YP7H
StatsPapers: Deep Multi-View Learning via Task-Optimal CCA. https://t.co/IROZimItBe
arxiv_cscv: Deep Multi-View Learning via Task-Optimal CCA https://t.co/0vrUFLxiMu
arxiv_cscv: Deep Multi-View Learning via Task-Optimal CCA https://t.co/0vrUFLfHnU
###### Other stats
Sample Sizes : None.
Authors: 6
Total Words: 7401
Unique Words: 2282

##### #4. Design and Evaluation of Product Aesthetics: A Human-Machine Hybrid Approach
###### Alex Burnap, John R. Hauser, Artem Timoshenko
Aesthetics are critically important to market acceptance in many product categories. In the automotive industry in particular, an improved aesthetic design can boost sales by 30% or more. Firms invest heavily in designing and testing new product aesthetics. A single automotive "theme clinic" costs between \$100,000 and \$1,000,000, and hundreds are conducted annually. We use machine learning to augment human judgment when designing and testing new product aesthetics. The model combines a probabilistic variational autoencoder (VAE) and adversarial components from generative adversarial networks (GAN), along with modeling assumptions that address managerial requirements for firm adoption. We train our model with data from an automotive partner: 7,000 images evaluated by targeted consumers and 180,000 high-quality unrated images. Our model predicts the appeal of new aesthetic designs well: a 38% improvement relative to a baseline and substantial improvement over both conventional machine learning models and pretrained deep learning...
###### Tweets
BrundageBot: Design and Evaluation of Product Aesthetics: A Human-Machine Hybrid Approach. Alex Burnap, John R. Hauser, and Artem Timoshenko https://t.co/PiFBJ2M8GQ
arxivml: "Design and Evaluation of Product Aesthetics: A Human-Machine Hybrid Approach", Alex Burnap, John R. Hauser, Artem … https://t.co/63DIUOgUC4
StatsPapers: Design and Evaluation of Product Aesthetics: A Human-Machine Hybrid Approach. https://t.co/SQEGY8CkQY
arxiv_cscv: Design and Evaluation of Product Aesthetics: A Human-Machine Hybrid Approach https://t.co/IKgbNDUy7N
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unique Words: 0

##### #5. Convolutional Reservoir Computing for World Models
###### Hanten Chang, Katsuya Futagami
Recently, reinforcement learning models have achieved great success, completing complex tasks such as mastering Go and other games with higher scores than human players. Many of these models collect considerable data on the tasks and improve accuracy by extracting visual and time-series features using convolutional neural networks (CNNs) and recurrent neural networks, respectively. However, these networks have very high computational costs because they need to be trained by repeatedly using a large volume of past playing data. In this study, we propose a novel practical approach called the reinforcement learning with convolutional reservoir computing (RCRC) model. The RCRC model has several desirable features: 1. it can extract visual and time-series features very quickly because it uses a random fixed-weight CNN and the reservoir computing model; 2. it does not require the training data to be stored because it extracts features without training and decides actions with an evolution strategy. Furthermore, the model achieves state of the art...
###### Tweets
BrundageBot: Convolutional Reservoir Computing for World Models. Hanten Chang and Katsuya Futagami https://t.co/CDmOJs2dWc
arxivml: "Convolutional Reservoir Computing for World Models", Hanten Chang, Katsuya Futagami https://t.co/XNzvRHY13E
arxiv_cs_LG: Convolutional Reservoir Computing for World Models. Hanten Chang and Katsuya Futagami https://t.co/Z1GOTdv7CK
StatsPapers: Convolutional Reservoir Computing for World Models. https://t.co/AOvlJIZzjU
a_nemecek: RT @arxivml: "Convolutional Reservoir Computing for World Models", Hanten Chang, Katsuya Futagami https://t.co/XNzvRHY13E
###### Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

##### #6. Learning Privately over Distributed Features: An ADMM Sharing Approach
###### Yaochen Hu, Peng Liu, Linglong Kong, Di Niu
Distributed machine learning has been widely studied in order to handle the exploding amount of data. In this paper, we study an important yet less visited distributed learning problem where features are inherently distributed or vertically partitioned among multiple parties, and sharing of raw data or model parameters among parties is prohibited due to privacy concerns. We propose an ADMM sharing framework to approach risk minimization over distributed features, where each party only needs to share a single value for each sample in the training process, thus minimizing the data leakage risk. We establish convergence and iteration complexity results for the proposed parallel ADMM algorithm under non-convex loss. We further introduce a novel differentially private ADMM sharing algorithm and bound the privacy guarantee with carefully designed noise perturbation. The experiments based on a prototype system show that the proposed ADMM algorithms converge efficiently in a robust fashion, demonstrating an advantage over gradient-based methods...
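To illustrate why a single shared value per sample can be enough when features are vertically partitioned, the sketch below assumes a linear model, which the abstract does not state. Each party computes a partial score from its own features and weights and shares only that scalar. This is not the ADMM algorithm or its differentially private variant; the party names, feature values, and weights are all hypothetical.

```python
from typing import List


def partial_score(features: List[float], weights: List[float]) -> float:
    """Dot product of one party's local features with its local weights."""
    return sum(x * w for x, w in zip(features, weights))


# One hypothetical sample whose five features are split across two parties.
party_a_features = [1.0, 2.0]
party_b_features = [0.5, -1.0, 3.0]

# Each party keeps its own weights private (illustrative values).
party_a_weights = [0.2, -0.4]
party_b_weights = [1.0, 0.5, 0.1]

# Each party shares exactly one number for this sample.
shared_a = partial_score(party_a_features, party_a_weights)
shared_b = partial_score(party_b_features, party_b_weights)
combined = shared_a + shared_b

# Reference value: what a single party holding everything would compute.
full = partial_score(
    party_a_features + party_b_features,
    party_a_weights + party_b_weights,
)

print(combined, full)
```

The combined score matches the centralized one, yet neither party sees the other's raw features or weights.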
###### Tweets
arxiv_org: Learning Privately over Distributed Features: An ADMM Sharing Approach. https://t.co/4kAiNCFlXM https://t.co/MWSpUWRAnV
BrundageBot: Learning Privately over Distributed Features: An ADMM Sharing Approach. Yaochen Hu, Peng Liu, Linglong Kong, and Di Niu https://t.co/ojn9cnLFWI
arxivml: "Learning Privately over Distributed Features: An ADMM Sharing Approach", Yaochen Hu, Peng Liu, Linglong Kong, Di N… https://t.co/9pFz6BpZ5N
StatsPapers: Learning Privately over Distributed Features: An ADMM Sharing Approach. https://t.co/Fe17LsnW28
###### Other stats
Sample Sizes : None.
Authors: 4
Total Words: 0
Unique Words: 0

##### #7. Credit Assignment as a Proxy for Transfer in Reinforcement Learning
###### Johan Ferret, Raphaël Marinier, Matthieu Geist, Olivier Pietquin
The ability to transfer representations to novel environments and tasks is a sensible requirement for general learning agents. Despite the apparent promises, transfer in Reinforcement Learning is still an open and under-exploited research area. In this paper, we suggest that credit assignment, regarded as a supervised learning task, could be used to accomplish transfer. Our contribution is twofold: we introduce a new credit assignment mechanism based on self-attention, and show that the learned credit can be transferred to in-domain and out-of-domain scenarios.
###### Tweets
BrundageBot: Credit Assignment as a Proxy for Transfer in Reinforcement Learning. Johan Ferret, Raphaël Marinier, Matthieu Geist, and Olivier Pietquin https://t.co/Vgjv2JVZmY
arxivml: "Credit Assignment as a Proxy for Transfer in Reinforcement Learning", Johan Ferret, Raphaël Marinier, Matthieu Gei… https://t.co/MjdT9pUpP4
arxiv_cs_LG: Credit Assignment as a Proxy for Transfer in Reinforcement Learning. Johan Ferret, Raphaël Marinier, Matthieu Geist, and Olivier Pietquin https://t.co/7Xkg70uNUX
SciFi: Credit Assignment as a Proxy for Transfer in Reinforcement Learning. https://t.co/LvlByFAV7n
###### Other stats
Sample Sizes : None.
Authors: 4
Total Words: 0
Unique Words: 0

##### #8. Neural Shuffle-Exchange Networks - Sequence Processing in O(n log n) Time
###### Kārlis Freivalds, Emīls Ozoliņš, Agris Šostaks
A key requirement in sequence to sequence processing is the modeling of long range dependencies. To this end, a vast majority of the state-of-the-art models use an attention mechanism of O($n^2$) complexity, which leads to slow execution for long sequences. We introduce a new Shuffle-Exchange neural network model for sequence to sequence tasks which has O(log n) depth and O(n log n) total complexity. We show that this model is powerful enough to infer efficient algorithms for common algorithmic benchmarks including sorting, addition and multiplication. We evaluate our architecture on the challenging LAMBADA question answering dataset and compare it with the state-of-the-art models which use attention. Our model achieves competitive accuracy and scales to sequences with more than a hundred thousand elements. We are confident that the proposed model has the potential for building more efficient architectures for processing large interrelated data in language modeling, music generation and other application domains.
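The complexity figures can be made concrete with the classic shuffle-exchange connectivity pattern that the model's name refers to. The sketch below is an illustration under assumptions, not the paper's architecture: it models only the perfect-shuffle permutation and counts pairwise units, assuming log2(n) layers of n/2 units each and a power-of-two sequence length. The learned switch units themselves are not modeled, and all function names are hypothetical.

```python
from typing import List, Sequence


def perfect_shuffle(items: Sequence[int]) -> List[int]:
    """Interleave the first and second halves, like a perfect riffle shuffle."""
    half = len(items) // 2
    out: List[int] = []
    for i in range(half):
        out.append(items[i])
        out.append(items[i + half])
    return out


def count_pairwise_units(n: int) -> int:
    """Units in log2(n) layers, each layer pairing up n/2 adjacent positions."""
    depth = n.bit_length() - 1  # log2(n) for a power of two
    return depth * (n // 2)


def shuffles_until_identity(n: int) -> int:
    """Number of perfect shuffles needed to restore the original order."""
    original = list(range(n))
    current = perfect_shuffle(original)
    rounds = 1
    while current != original:
        current = perfect_shuffle(current)
        rounds += 1
    return rounds


print(perfect_shuffle(list(range(8))))
print(shuffles_until_identity(8))
print(count_pairwise_units(8), "pairwise units versus", 8 * 8, "attention pairs")
```

For a power-of-two length the order is restored after log2(n) shuffles, which gives some intuition for how a logarithmic number of layers can bring distant positions together.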
###### Tweets
BrundageBot: Neural Shuffle-Exchange Networks $-$ Sequence Processing in O(n log n) Time. Kārlis Freivalds, Emīls Ozoliņš, and Agris Šostaks https://t.co/nm975lC8B3
arxivml: "Neural Shuffle-Exchange Networks $-$ Sequence Processing in O(n log n) Time", Kārlis Freivalds, Emīls Ozoliņš, Agr… https://t.co/z5EiSvL9A5
arxiv_cs_LG: Neural Shuffle-Exchange Networks $-$ Sequence Processing in O(n log n) Time. Kārlis Freivalds, Emīls Ozoliņš, and Agris Šostaks https://t.co/Y9GUun3wCY
Memoirs: Neural Shuffle-Exchange Networks $-$ Sequence Processing in O(n log n) Time. https://t.co/p4S7mPD9tv
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unique Words: 0

##### #9. Probabilistic Regressor Chains with Monte Carlo Methods
###### Jesse Read, Luca Martino
A large number and diversity of techniques have been offered in the literature in recent years for solving multi-label classification tasks, including classifier chains where predictions are cascaded to other models as additional features. The idea of extending this chaining methodology to multi-output regression has already been suggested and trialed: regressor chains. However, this has so far been limited to greedy inference, has provided relatively poor results compared to individual models, and has been of limited applicability. In this paper we identify and discuss the main limitations, including an analysis of different base models, loss functions, explainability, and other desiderata of real-world applications. To overcome the identified limitations we study and develop methods for regressor chains. In particular we present a sequential Monte Carlo scheme in the framework of a probabilistic regressor chain, and we show it can be effective, flexible and useful in several types of data. We place regressor chains in context in...
###### Tweets
BrundageBot: Probabilistic Regressor Chains with Monte Carlo Methods. Jesse Read and Luca Martino https://t.co/JarhAGZqE9
arxivml: "Probabilistic Regressor Chains with Monte Carlo Methods", Jesse Read, Luca Martino https://t.co/8dadcjcxiD
arxiv_cs_LG: Probabilistic Regressor Chains with Monte Carlo Methods. Jesse Read and Luca Martino https://t.co/xJiEMUyfJS
StatsPapers: Probabilistic Regressor Chains with Monte Carlo Methods. https://t.co/c55ASOBLeb
###### Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

##### #10. Autoencoder-Based Incremental Class Learning without Retraining on Old Data
###### Euntae Choi, Kyungmi Lee, Kiyoung Choi
Incremental class learning, a scenario in the continual learning context where classes and their training data are observed sequentially and disjointly, poses a challenge widely known as catastrophic forgetting. In this work, we propose a novel incremental class learning method that can significantly reduce memory overhead compared to previous approaches. Unlike the conventional classification scheme using softmax, our model is based on an autoencoder that extracts prototypes for given inputs so that no change in its output unit is required. It stores only the mean of prototypes per class to perform metric-based classification, unlike rehearsal approaches which rely on large memory or a generative model. To mitigate catastrophic forgetting, regularization methods are applied to our model when a new task is encountered. We evaluate our method by experimenting on CIFAR-100 and CUB-200-2011 and show that its performance is comparable to the state-of-the-art method with much lower additional memory cost.
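The idea of storing only one mean prototype per class and classifying by distance can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 2-D prototype vectors stand in for autoencoder outputs, squared Euclidean distance is an assumed metric, the regularization step is omitted, and all names and values are hypothetical.

```python
from typing import Dict, List, Sequence


def mean_prototype(prototypes: List[Sequence[float]]) -> List[float]:
    """Average a class's prototype vectors into one stored mean."""
    dim = len(prototypes[0])
    return [sum(p[d] for p in prototypes) / len(prototypes) for d in range(dim)]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(prototype: Sequence[float], means: Dict[str, List[float]]) -> str:
    """Return the label whose stored mean is closest to the given prototype."""
    return min(means, key=lambda label: squared_distance(prototype, means[label]))


# Stored state is one mean vector per class; the raw training data is not kept.
means: Dict[str, List[float]] = {
    "cat": mean_prototype([[0.0, 0.0], [0.2, 0.0]]),
    "dog": mean_prototype([[1.0, 1.0], [1.0, 0.8]]),
}
before = classify([0.9, 1.0], means)

# A new class arrives later: only one more mean is added, old means are untouched.
means["bird"] = mean_prototype([[-1.0, 1.0]])
after = classify([0.9, 1.0], means)
new_class = classify([-0.8, 0.9], means)

print(before, after, new_class)
```

Adding a class changes only the dictionary of means, which is why no output layer needs resizing and the extra memory grows by a single vector per class.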
###### Tweets
BrundageBot: Autoencoder-Based Incremental Class Learning without Retraining on Old Data. Euntae Choi, Kyungmi Lee, and Kiyoung Choi https://t.co/ScOGoUwO1z
arxivml: "Autoencoder-Based Incremental Class Learning without Retraining on Old Data", Euntae Choi, Kyungmi Lee, Kiyoung Ch… https://t.co/C4UaEJxvcq
arxiv_cs_LG: Autoencoder-Based Incremental Class Learning without Retraining on Old Data. Euntae Choi, Kyungmi Lee, and Kiyoung Choi https://t.co/uJI1UVDJ32
StatsPapers: Autoencoder-Based Incremental Class Learning without Retraining on Old Data. https://t.co/lOiu12BDAa
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unique Words: 0

###### About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 160,428 papers.
