Top 8 arXiv Papers Today in Machine Learning


2.029 Mikeys
#1. Multi-Domain Adversarial Learning
Alice Schoenauer-Sebag, Louise Heinrich, Marc Schoenauer, Michele Sebag, Lani F. Wu, Steve J. Altschuler
Multi-domain learning (MDL) aims at obtaining a model with minimal average risk across multiple domains. Our empirical motivation is automated microscopy data, where cultured cells are imaged after being exposed to known and unknown chemical perturbations, and each dataset displays significant experimental bias. This paper presents a multi-domain adversarial learning approach, MuLANN, to leverage multiple datasets with overlapping but distinct class sets, in a semi-supervised setting. Our contributions include: i) a bound on the average- and worst-domain risk in MDL, obtained using the H-divergence; ii) a new loss to accommodate semi-supervised multi-domain learning and domain adaptation; iii) the experimental validation of the approach, improving on the state of the art on two standard image benchmarks, and a novel bioimage dataset, Cell.
Tweets
arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
arxivml: "Multi-Domain Adversarial Learning", Alice Schoenauer-Sebag, Louise Heinrich, Marc Schoenauer, Michele Sebag, Lani … https://t.co/Bx9pb4SDxm
StatsPapers: Multi-Domain Adversarial Learning. https://t.co/XbFg9czxNk
ThomasScialom: RT @arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
jaialkdanel: RT @arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
subhobrata1: RT @arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
shubh_300595: RT @arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
thapraveensingh: RT @arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
Github

Code and data of the "Multi-domain adversarial learning" paper, Schoenauer-Sebag et al., accepted at ICLR 2019

Repository: MuLANN
User: AltschulerWu-Lab
Language: Lua
Stargazers: 1
Subscribers: 2
Forks: 0
Open Issues: 0
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 6
Total Words: 11637
Unique Words: 3752

2.022 Mikeys
#2. Empirical confidence estimates for classification by deep neural networks
Chris Finlay, Adam M. Oberman
How well can we estimate the probability that the classification, $C(f(x))$, predicted by a deep neural network is correct (or in the Top 5)? We consider the case of a classification neural network trained with the KL divergence which is assumed to generalize, as measured empirically by the test error and test loss. We present conditional probabilities for predictions based on the histogram of uncertainty metrics, which have a significant Bayes ratio. Previous work in this area includes Bayesian neural networks. Our metric is twice as predictive, based on the expected Bayes ratio, on ImageNet compared to our best tuned implementation of Bayesian dropout (Gal and Ghahramani, 2016). Our method uses just the softmax values and a stored histogram, so it is essentially free to compute, compared to many times the inference cost for Bayesian dropout.
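The histogram idea in this abstract can be illustrated with a minimal sketch. It assumes the uncertainty metric is the largest softmax value and that equal-width bins are used; both are assumptions made here for illustration, and the function names `fit_histogram` and `predict_confidence` are hypothetical, not taken from the paper.

```python
def fit_histogram(scores, correct, n_bins=10):
    """Estimate P(correct | score bin) from held-out predictions.

    scores  : max softmax value per sample, each in [0, 1] (assumed metric)
    correct : 1 if the prediction was right, 0 otherwise
    """
    hits = [0] * n_bins
    counts = [0] * n_bins
    for s, c in zip(scores, correct):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        counts[b] += 1
        hits[b] += int(c)
    # Bins with no held-out samples get None rather than a guessed value.
    return [h / n if n else None for h, n in zip(hits, counts)]


def predict_confidence(hist, score):
    """Look up the stored empirical accuracy for a new softmax score."""
    b = min(int(score * len(hist)), len(hist) - 1)
    return hist[b]


held_out_scores = [0.95, 0.92, 0.55, 0.51, 0.58, 0.97]
held_out_correct = [1, 1, 0, 1, 0, 1]
hist = fit_histogram(held_out_scores, held_out_correct)
print(predict_confidence(hist, 0.93))
print(predict_confidence(hist, 0.52))
```

At inference time the lookup costs one multiplication and one list access, which is the sense in which a stored histogram is cheap compared to repeated stochastic forward passes.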
Tweets
arxiv_org: Empirical confidence estimates for classification by deep neural networks. https://t.co/aKcQ9XUDcZ https://t.co/aLjnikj1lL
bgoncalves: Empirical confidence estimates for classification by deep neural networks. (arXiv:1903.09215v1 [https://t.co/dgBUOpxd8x]) https://t.co/0CPsI0x0nT
arxivml: "Empirical confidence estimates for classification by deep neural networks", Chris Finlay, Adam M. Oberman https://t.co/YzJQs1E5xb
StatsPapers: Empirical confidence estimates for classification by deep neural networks. https://t.co/Mmn7JQ00Hv
jaialkdanel: RT @arxiv_org: Empirical confidence estimates for classification by deep neural networks. https://t.co/aKcQ9XUDcZ https://t.co/aLjnikj1lL
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 3773
Unique Words: 1244

2.021 Mikeys
#3. Gradient-only line searches: An Alternative to Probabilistic Line Searches
Dominic Kafka, Daniel Wilke
Step sizes in neural network training are largely determined using predetermined rules such as fixed learning rates and learning rate schedules, which require user input to determine their functional form and associated hyperparameters. Global optimization strategies to resolve these hyperparameters are computationally expensive. Line searches are capable of adaptively resolving learning rate schedules. However, due to discontinuities induced by mini-batch sampling, they have largely fallen out of favor. Notwithstanding, probabilistic line searches have recently demonstrated viability in resolving learning rates for stochastic loss functions. This method creates surrogates with confidence intervals, where restrictions are placed on the rate at which the search domain can grow along a search direction. This paper introduces an alternative paradigm, Gradient-Only Line Searches that are inexact (GOLS-I), as an alternative strategy to automatically resolve learning rates in stochastic cost functions over a range of 15 orders...
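A rough sketch of what a line search that uses only gradient information might look like is given below. It is a simplified illustration of locating a sign change in the directional derivative by growing or shrinking the step, not GOLS-I as specified in the paper; the function names, the growth factor, and the deterministic quadratic test function are all assumptions made for this example.

```python
def directional_derivative(grad, x, d, alpha):
    """Slope of the loss along direction d, evaluated at step size alpha."""
    point = [xi + alpha * di for xi, di in zip(x, d)]
    return sum(gi * di for gi, di in zip(grad(point), d))


def gradient_only_step(grad, x, d, alpha=1.0, factor=2.0, max_iter=50):
    """Return a step size close to where the slope along d turns from - to +.

    Only gradient evaluations are used; the loss value is never queried.
    """
    if directional_derivative(grad, x, d, alpha) < 0:
        # Still descending at alpha: grow until the slope is non-negative.
        for _ in range(max_iter):
            alpha *= factor
            if directional_derivative(grad, x, d, alpha) >= 0:
                break
    else:
        # Overshot the sign change: shrink until the slope is negative again.
        for _ in range(max_iter):
            alpha /= factor
            if directional_derivative(grad, x, d, alpha) < 0:
                break
    return alpha


def quadratic_grad(p):
    """Gradient of the toy loss f(p) = sum(p_i ** 2)."""
    return [2.0 * pi for pi in p]


x0 = [4.0]
direction = [-g for g in quadratic_grad(x0)]  # steepest descent direction
print(gradient_only_step(quadratic_grad, x0, direction, alpha=1.0))
```

In a mini-batch setting `grad` would return a stochastic gradient, so the sign change found this way is only approximate; the sketch does not model that noise.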
Tweets
arxivml: "Gradient-only line searches: An Alternative to Probabilistic Line Searches", Dominic Kafka, Daniel Wilke https://t.co/wdmkqixs6c
daniwi79: As mentioned in my talk Untangling Information for #MachineLearning and #Deeplearning Training for #DataScientists @nvidia @NvidiaAI #GTC19 #GTC2019 our two papers: https://t.co/wXUrgj4Wy5 and https://t.co/DdG2nrK0sa #PyTorch and #TensorFlow code is coming soon.
daniwi79: @JeffDean @GoogleAI @berkeley_ai Excellence in collaboration with @GoogleAI! Hope to see collaboration with #Africa and in particular #SouthAfrica growing with institutes like @UPTuks adding to the diversity of thought and understanding https://t.co/wXUrgj4Wy5 https://t.co/DdG2nrK0sa https://t.co/ChACKG2RJ6
arxiv_cs_LG: Gradient-only line searches: An Alternative to Probabilistic Line Searches. Dominic Kafka and Daniel Wilke https://t.co/yFeiauyCd8
Memoirs: Gradient-only line searches: An Alternative to Probabilistic Line Searches. https://t.co/3B1BT5MavI
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 11030
Unique Words: 2503

2.016 Mikeys
#4. Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data
Luis Roque, Cristina A. C. Fernandes, Tony Silva
Time series data in the retail world are particularly rich in terms of dimensionality, and these dimensions can be aggregated in groups or hierarchies. Valuable information is nested in these complex structures, which helps to predict the aggregated time series data. From a portfolio of brands under HUUB's monitoring, we selected two to explore their sales behaviour, leveraging the grouping properties of their product structure. Using statistical models, namely SARIMA, to forecast each level of the hierarchy, an optimal combination approach was used to generate more consistent forecasts in the higher levels. Our results show that the proposed methods can indeed capture nested information in the more granular series, helping to improve the forecast accuracy of the aggregated series. The Weighted Least Squares (WLS) method surpasses all other methods proposed in the study, including the Minimum Trace (MinT) reconciliation.
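To make the reconciliation step concrete, here is a small sketch of a generic weighted least squares combination for the simplest possible hierarchy, one total with two children. The hierarchy, the use of forecast error variances as weights, and the function name `reconcile_wls` are assumptions for illustration; the abstract does not state which weights the authors used.

```python
def reconcile_wls(base, variances):
    """Reconcile base forecasts [total, a, b] so that total == a + b.

    Solves the weighted normal equations (S' W S) beta = S' W y for the
    summing matrix S = [[1, 1], [1, 0], [0, 1]] and W = diag(1 / variance).
    """
    y_t, y_a, y_b = base
    w_t, w_a, w_b = (1.0 / v for v in variances)
    # Entries of the symmetric 2x2 matrix S' W S.
    m11, m12, m22 = w_t + w_a, w_t, w_t + w_b
    # Right-hand side S' W y.
    r1 = w_t * y_t + w_a * y_a
    r2 = w_t * y_t + w_b * y_b
    det = m11 * m22 - m12 * m12
    a = (m22 * r1 - m12 * r2) / det
    b = (m11 * r2 - m12 * r1) / det
    return [a + b, a, b]


# Incoherent base forecasts: the children sum to 90 but the total says 100.
print(reconcile_wls([100.0, 60.0, 30.0], [1.0, 1.0, 1.0]))
```

With equal variances this reduces to ordinary least squares, which spreads the 10-unit discrepancy evenly across the three series. Giving a series a larger variance makes the reconciliation adjust that series more.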
Figures
None.
Tweets
arxivml: "Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data", Luis Roque, Cristina A. C. Fernandes, Tony … https://t.co/AjrCHRjK4u
arxiv_cs_LG: Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data. Luis Roque, Cristina A. C. Fernandes, and Tony Silva https://t.co/pd7AYTLNtC
StatsPapers: Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data. https://t.co/789ztWauik
cris_cfernandes: The first of a series of papers about HUUB's retail forecasting methodology: https://t.co/NSeT3MGW0w
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 6936
Unique Words: 1942

2.015 Mikeys
#5. Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models
Taiyao Wang, Ioannis Ch. Paschalidis
We consider Mixed Linear Regression (MLR), where training data have been generated from a mixture of distinct linear models (or clusters) and we seek to identify the corresponding coefficient vectors. We introduce a Mixed Integer Programming (MIP) formulation for MLR subject to regularization constraints on the coefficient vectors. We establish that as the number of training samples grows large, the MIP solution converges to the true coefficient vectors in the absence of noise. Subject to slightly stronger assumptions, we also establish that the MIP identifies the clusters from which the training samples were generated. In the special case where training data come from a single cluster, we establish that the corresponding MIP yields a solution that converges to the true coefficient vector even when training data are perturbed by (martingale difference) noise. We provide a counterexample indicating that in the presence of noise, the MIP may fail to produce the true coefficient vectors for more than one cluster. We also...
Tweets
arxiv_org: Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models. https://t.co/CO9F9BHfBB https://t.co/v0CR6Yp4jp
arxivml: "Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models", Taiyao Wang, Ioannis Ch. Pasch… https://t.co/6z8gEq28PJ
StatsPapers: Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models. https://t.co/q0t9HrAXyN
subhobrata1: RT @arxiv_org: Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models. https://t.co/CO9F9BHfBB https://t.co/v0CR…
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 4837
Unique Words: 1426

2.014 Mikeys
#6. The Binary Space Partitioning-Tree Process
Xuhui Fan, Bin Li, Scott Anthony Sisson
The Mondrian process represents an elegant and powerful approach for space partition modelling. However, as it restricts the partitions to be axis-aligned, its modelling flexibility is limited. In this work, we propose a self-consistent Binary Space Partitioning (BSP)-Tree process to generalize the Mondrian process. The BSP-Tree process is an almost surely right continuous Markov jump process that allows uniformly distributed oblique cuts in a two-dimensional convex polygon. The BSP-Tree process can also be extended using a non-uniform probability measure to generate direction differentiated cuts. The process is also self-consistent, maintaining distributional invariance under a restricted subdomain. We use Conditional-Sequential Monte Carlo for inference using the tree structure as the high-dimensional variable. The BSP-Tree process's performance on synthetic data partitioning and relational modelling demonstrates clear inferential improvements over the standard Mondrian process and other related methods.
Tweets
arxivml: "The Binary Space Partitioning-Tree Process", Xuhui Fan, Bin Li, Scott Anthony Sisson https://t.co/VlV2vPC9Rg
SciFi: The Binary Space Partitioning-Tree Process. https://t.co/in21zw1lZ2
arxiv_cs_LG: The Binary Space Partitioning-Tree Process. Xuhui Fan, Bin Li, and Scott Anthony Sisson https://t.co/GsLn2naJM0
muktabh: RT @arxiv_cs_LG: The Binary Space Partitioning-Tree Process. Xuhui Fan, Bin Li, and Scott Anthony Sisson https://t.co/GsLn2naJM0
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 8656
Unique Words: 2321

2.014 Mikeys
#7. Binary Space Partitioning Forests
Xuhui Fan, Bin Li, Scott Anthony Sisson
The Binary Space Partitioning (BSP)-Tree process was proposed to produce flexible 2-D partition structures, which were originally used as a Bayesian nonparametric prior for relational modelling. It can hardly be applied to other learning tasks such as regression trees because extending the BSP-Tree process to a higher dimensional space is nontrivial. This paper is the first attempt to extend the BSP-Tree process to a d-dimensional (d>2) space. We propose to generate a cutting hyperplane, which is assumed to be parallel to d-2 dimensions, to cut each node in the d-dimensional BSP-tree. By designing a subtle strategy to sample two free dimensions from d dimensions, the extended BSP-Tree process can inherit the essential self-consistency property from the original version. Based on the extended BSP-Tree process, an ensemble model, which is named the BSP-Forest, is further developed for regression tasks. Thanks to the retained self-consistency property, we can thus significantly reduce the geometric calculations in the inference stage....
Figures
None.
Tweets
arxivml: "Binary Space Partitioning Forests", Xuhui Fan, Bin Li, Scott Anthony Sisson https://t.co/XCe37honWc
SciFi: Binary Space Partitioning Forests. https://t.co/odppJemeyH
arxiv_cs_LG: Binary Space Partitioning Forests. Xuhui Fan, Bin Li, and Scott Anthony Sisson https://t.co/G83Qj2V72y
muktabh: RT @arxiv_cs_LG: Binary Space Partitioning Forests. Xuhui Fan, Bin Li, and Scott Anthony Sisson https://t.co/G83Qj2V72y
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 0
Unique Words: 0

2.011 Mikeys
#8. Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions
Kurt Izak Cabanilla, Kevin Thomas Go
Sometimes knowing the future given the present is not enough. For sound policy making, predicting possible futures given different user defined scenarios can be more important. However, the workhorse for causality detection and impulse response, the Vector Autoregression (VAR), assumes linearity and has produced poor forecasts (Reis, 2018). Here, we introduce a vector autoencoder nonlinear autoregression neural network (VANAR) capable of both automatic time series feature extraction for its inputs and automatic functional form estimation. We compare the performance of VANAR and VAR across three tests: (1) forecasting skill, measured as n-step ahead forecast accuracy, (2) correct detection of Granger Causality between variables, and (3) impulse response tests on modeled trajectories subject to external shocks. These tests were performed on datasets with different underlying dynamics: a simulated nonlinear chaotic system, a simulated linear system, and an empirical system using Philippine macroeconomic data. Results show that VANAR...
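For readers unfamiliar with the linear baseline mentioned here, the following sketch shows how an impulse response is traced through a VAR(1) model. It covers only the classical linear VAR, not the VANAR network from the paper; the coefficient matrix and the helper names are made up for the example.

```python
def mat_vec(a, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def impulse_response(a, shock, horizon):
    """Trace a one-time shock through y_t = A y_{t-1} + e_t.

    Returns the response at horizons 0..horizon, i.e. A^h applied to the shock.
    """
    path = [list(shock)]
    for _ in range(horizon):
        path.append(mat_vec(a, path[-1]))
    return path


# Hypothetical coefficients: variable 1 feeds into variable 2, but not the reverse.
A = [[0.5, 0.0],
     [0.2, 0.5]]

for h, response in enumerate(impulse_response(A, [1.0, 0.0], 3)):
    print(h, response)
```

Because the entry of `A` linking variable 2 to variable 1 is zero, a shock to variable 2 never reaches variable 1. In a VAR(1) this zero coefficient is exactly the condition under which variable 2 does not Granger-cause variable 1.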
Tweets
arxivml: "Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions", Ku… https://t.co/o3zcfbNyRF
arxiv_cs_LG: Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions. Kurt Izak Cabanilla and Kevin Thomas Go https://t.co/vBbq0xPC10
Memoirs: Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions. https://t.co/LVTabHrN5i
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 7736
Unique Words: 1939

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter: @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 100,377 papers.
