### Top 8 arXiv Papers Today in Machine Learning

##### #1. Multi-Domain Adversarial Learning
###### Alice Schoenauer-Sebag, Louise Heinrich, Marc Schoenauer, Michele Sebag, Lani F. Wu, Steve J. Altschuler
Multi-domain learning (MDL) aims at obtaining a model with minimal average risk across multiple domains. Our empirical motivation is automated microscopy data, where cultured cells are imaged after being exposed to known and unknown chemical perturbations, and each dataset displays significant experimental bias. This paper presents a multi-domain adversarial learning approach, MuLANN, to leverage multiple datasets with overlapping but distinct class sets, in a semi-supervised setting. Our contributions include: i) a bound on the average- and worst-domain risk in MDL, obtained using the H-divergence; ii) a new loss to accommodate semi-supervised multi-domain learning and domain adaptation; iii) the experimental validation of the approach, improving on the state of the art on two standard image benchmarks, and a novel bioimage dataset, Cell.
###### Tweets
arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
arxivml: "Multi-Domain Adversarial Learning", Alice Schoenauer-Sebag, Louise Heinrich, Marc Schoenauer, Michele Sebag, Lani … https://t.co/Bx9pb4SDxm
ThomasScialom: RT @arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
jaialkdanel: RT @arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
subhobrata1: RT @arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
shubh_300595: RT @arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
thapraveensingh: RT @arxiv_org: Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y
###### Github

Code and data of the "Multi-domain adversarial learning" paper, Schoenauer-Sebag et al., accepted at ICLR 2019

Repository: MuLANN
User: AltschulerWu-Lab
Language: Lua
Stargazers: 1
Subscribers: 2
Forks: 0
Open Issues: 0
###### Other stats
Sample Sizes : None.
Authors: 6
Total Words: 11637
Unique Words: 3752

##### #2. Empirical confidence estimates for classification by deep neural networks
###### Chris Finlay, Adam M. Oberman
How well can we estimate the probability that the classification, $C(f(x))$, predicted by a deep neural network is correct (or in the Top 5)? We consider the case of a classification neural network trained with the KL divergence which is assumed to generalize, as measured empirically by the test error and test loss. We present conditional probabilities for predictions based on the histogram of uncertainty metrics, which have a significant Bayes ratio. Previous work in this area includes Bayesian neural networks. Our metric is twice as predictive, based on the expected Bayes ratio, on ImageNet compared to our best tuned implementation of Bayesian dropout (Gal and Ghahramani, 2016). Our method uses just the softmax values and a stored histogram, so it is essentially free to compute, compared to many times the inference cost for Bayesian dropout.
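The histogram idea in this abstract can be illustrated with a minimal sketch. It assumes the uncertainty metric is the maximum softmax value and uses equal-width bins; the function names `fit_confidence_histogram` and `lookup_confidence` are hypothetical, the data are made up, and this is not the authors' implementation.

```python
def fit_confidence_histogram(max_probs, correct, n_bins=10):
    """Estimate P(correct | bin of max softmax value) from held-out data.

    Assumption: the uncertainty metric is the max softmax value in [0, 1],
    split into n_bins equal-width bins.
    """
    hits = [0] * n_bins
    counts = [0] * n_bins
    for p, ok in zip(max_probs, correct):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        counts[b] += 1
        hits[b] += int(ok)
    # Bins with no samples carry no empirical evidence, so they map to None.
    return [h / c if c else None for h, c in zip(hits, counts)]


def lookup_confidence(hist, p):
    """Return the stored empirical accuracy for the bin containing p."""
    b = min(int(p * len(hist)), len(hist) - 1)
    return hist[b]


# Made-up held-out predictions: max softmax value and whether the label was right.
max_probs = [0.95, 0.92, 0.97, 0.55, 0.52, 0.58, 0.91]
correct = [True, True, True, False, True, False, False]

hist = fit_confidence_histogram(max_probs, correct)
estimate = lookup_confidence(hist, 0.93)  # empirical accuracy of the 0.9-1.0 bin
```

At inference time only a bin lookup is needed, which is consistent with the abstract's remark that the estimate is essentially free once the histogram is stored.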
###### Tweets
arxiv_org: Empirical confidence estimates for classification by deep neural networks. https://t.co/aKcQ9XUDcZ https://t.co/aLjnikj1lL
bgoncalves: Empirical confidence estimates for classification by deep neural networks. (arXiv:1903.09215v1 [https://t.co/dgBUOpxd8x]) https://t.co/0CPsI0x0nT
arxivml: "Empirical confidence estimates for classification by deep neural networks", Chris Finlay, Adam M． Oberman https://t.co/YzJQs1E5xb
StatsPapers: Empirical confidence estimates for classification by deep neural networks. https://t.co/Mmn7JQ00Hv
jaialkdanel: RT @arxiv_org: Empirical confidence estimates for classification by deep neural networks. https://t.co/aKcQ9XUDcZ https://t.co/aLjnikj1lL
###### Other stats
Sample Sizes : None.
Authors: 2
Total Words: 3773
Unique Words: 1244

##### #3. Gradient-only line searches: An Alternative to Probabilistic Line Searches
###### Dominic Kafka, Daniel Wilke
Step sizes in neural network training are largely determined using predetermined rules such as fixed learning rates and learning rate schedules, which require user input to determine their functional form and associated hyperparameters. Global optimization strategies to resolve these hyperparameters are computationally expensive. Line searches are capable of adaptively resolving learning rate schedules. However, due to discontinuities induced by mini-batch sampling, they have largely fallen out of favor. Notwithstanding, probabilistic line searches have recently demonstrated viability in resolving learning rates for stochastic loss functions. This method creates surrogates with confidence intervals, where restrictions are placed on the rate at which the search domain can grow along a search direction. This paper introduces an alternative paradigm, Gradient-Only Line Searches that are inexact (GOLS-I), as an alternative strategy to automatically resolve learning rates in stochastic cost functions over a range of 15 orders...
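As a rough illustration of a line search that uses only gradient information, the sketch below adjusts a step size until the directional derivative along the search direction changes sign. This is a simplified sketch under stated assumptions, not GOLS-I as specified in the paper: the sign-change criterion, the doubling and halving rule, the bounds, and the names `directional_derivative` and `gradient_only_step` are all illustrative choices.

```python
def directional_derivative(grad, x, d, alpha):
    """Directional derivative of the loss along d, evaluated at x + alpha * d."""
    point = [xi + alpha * di for xi, di in zip(x, d)]
    g = grad(point)
    return sum(gi * di for gi, di in zip(g, d))


def gradient_only_step(grad, x, d, alpha=1.0, grow=2.0,
                       alpha_min=1e-8, alpha_max=1e7, max_iter=60):
    """Pick a step size using gradient evaluations only (no loss values).

    Assumption: we want the step near which the directional derivative
    goes from negative (still descending) to non-negative (overshoot).
    """
    if directional_derivative(grad, x, d, alpha) < 0:
        # Still descending at alpha: grow until the derivative turns non-negative.
        for _ in range(max_iter):
            if alpha * grow > alpha_max:
                break
            alpha *= grow
            if directional_derivative(grad, x, d, alpha) >= 0:
                break
    else:
        # Overshot at alpha: shrink until the derivative is negative again.
        for _ in range(max_iter):
            if alpha / grow < alpha_min:
                break
            alpha /= grow
            if directional_derivative(grad, x, d, alpha) < 0:
                break
    return alpha


# Toy problem: f(x) = 0.5 * x^2, so the gradient is x itself.
grad = lambda x: list(x)
x0 = [4.0]
d0 = [-4.0]  # steepest-descent direction at x0

step_from_small = gradient_only_step(grad, x0, d0, alpha=0.1)
step_from_large = gradient_only_step(grad, x0, d0, alpha=4.0)
```

Because no loss values are compared, the rule does not depend on function-value jumps between mini-batches, which is the motivation the abstract gives for revisiting line searches in stochastic training.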
###### Tweets
arxivml: "Gradient-only line searches: An Alternative to Probabilistic Line Searches", Dominic Kafka, Daniel Wilke https://t.co/wdmkqixs6c
daniwi79: As mentioned in my talk Untangling Information for #MachineLearning and #Deeplearning Training for #DataScientists @nvidia @NvidiaAI #GTC19 #GTC2019 our two papers: https://t.co/wXUrgj4Wy5 and https://t.co/DdG2nrK0sa #PyTorch and #TensorFlow code is coming soon.
daniwi79: @JeffDean @GoogleAI @berkeley_ai Excellence in collaboration with @GoogleAI! Hope to see collaboration with #Africa and in particular #SouthAfrica growing with institutes like @UPTuks adding to the diversity of thought and understanding https://t.co/wXUrgj4Wy5 https://t.co/DdG2nrK0sa https://t.co/ChACKG2RJ6
arxiv_cs_LG: Gradient-only line searches: An Alternative to Probabilistic Line Searches. Dominic Kafka and Daniel Wilke https://t.co/yFeiauyCd8
Memoirs: Gradient-only line searches: An Alternative to Probabilistic Line Searches. https://t.co/3B1BT5MavI
###### Other stats
Sample Sizes : None.
Authors: 2
Total Words: 11030
Unique Words: 2503

##### #4. Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data
###### Luis Roque, Cristina A. C. Fernandes, Tony Silva
Time series data in the retail world are particularly rich in terms of dimensionality, and these dimensions can be aggregated in groups or hierarchies. Valuable information is nested in these complex structures, which helps to predict the aggregated time series data. From a portfolio of brands under HUUB's monitoring, we selected two to explore their sales behaviour, leveraging the grouping properties of their product structure. Using statistical models, namely SARIMA, to forecast each level of the hierarchy, an optimal combination approach was used to generate more consistent forecasts in the higher levels. Our results show that the proposed methods can indeed capture nested information in the more granular series, helping to improve the forecast accuracy of the aggregated series. The Weighted Least Squares (WLS) method surpasses all other methods proposed in the study, including the Minimum Trace (MinT) reconciliation.
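To make the optimal combination step concrete, here is a minimal sketch for the smallest possible hierarchy, total = A + B. It applies the standard reconciliation formula, reconciled = S (SᵀW⁻¹S)⁻¹ SᵀW⁻¹ ŷ, with a diagonal W. The function name `reconcile_two_level`, the forecasts, and the variances are made up for illustration; the weights used in the paper are not stated in the abstract.

```python
def reconcile_two_level(y_hat, variances):
    """Reconcile base forecasts [total, a, b] so that total == a + b.

    variances is the diagonal of W (one forecast-error variance per series).
    Equal variances give the OLS combination; unequal ones give WLS.
    """
    # Summing matrix: each row maps the bottom series (a, b) to one series.
    S = [[1.0, 1.0],   # total = a + b
         [1.0, 0.0],   # a
         [0.0, 1.0]]   # b
    inv_w = [1.0 / v for v in variances]

    # Normal equations: (S^T W^-1 S) beta = S^T W^-1 y_hat, a 2x2 system.
    A = [[sum(S[k][i] * inv_w[k] * S[k][j] for k in range(3)) for j in range(2)]
         for i in range(2)]
    rhs = [sum(S[k][i] * inv_w[k] * y_hat[k] for k in range(3)) for i in range(2)]

    # Solve the 2x2 system with Cramer's rule.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    a = (A[1][1] * rhs[0] - A[0][1] * rhs[1]) / det
    b = (A[0][0] * rhs[1] - A[1][0] * rhs[0]) / det
    return [a + b, a, b]


# Base forecasts that do not add up: 40 + 50 != 100.
base = [100.0, 40.0, 50.0]
ols = reconcile_two_level(base, [1.0, 1.0, 1.0])
# Treat the total forecast as noisier, so it is adjusted more than the others.
wls = reconcile_two_level(base, [4.0, 1.0, 1.0])
```

In both cases the reconciled forecasts are coherent by construction; the weighting only changes how the 10-unit discrepancy is shared between the levels.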
###### Tweets
arxivml: "Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data", Luis Roque, Cristina A． C． Fernandes, Tony … https://t.co/AjrCHRjK4u
arxiv_cs_LG: Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data. Luis Roque, Cristina A. C. Fernandes, and Tony Silva https://t.co/pd7AYTLNtC
StatsPapers: Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data. https://t.co/789ztWauik
cris_cfernandes: The first of a series of papers about HUUB's retail forecasting methodology: https://t.co/NSeT3MGW0w
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 6936
Unique Words: 1942

##### #5. Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models
###### Taiyao Wang, Ioannis Ch. Paschalidis
We consider Mixed Linear Regression (MLR), where training data have been generated from a mixture of distinct linear models (or clusters) and we seek to identify the corresponding coefficient vectors. We introduce a Mixed Integer Programming (MIP) formulation for MLR subject to regularization constraints on the coefficient vectors. We establish that as the number of training samples grows large, the MIP solution converges to the true coefficient vectors in the absence of noise. Subject to slightly stronger assumptions, we also establish that the MIP identifies the clusters from which the training samples were generated. In the special case where training data come from a single cluster, we establish that the corresponding MIP yields a solution that converges to the true coefficient vector even when training data are perturbed by (martingale difference) noise. We provide a counterexample indicating that in the presence of noise, the MIP may fail to produce the true coefficient vectors for more than one cluster. We also...
###### Tweets
arxiv_org: Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models. https://t.co/CO9F9BHfBB https://t.co/v0CR6Yp4jp
arxivml: "Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models", Taiyao Wang, Ioannis Ch． Pasch… https://t.co/6z8gEq28PJ
StatsPapers: Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models. https://t.co/q0t9HrAXyN
subhobrata1: RT @arxiv_org: Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models. https://t.co/CO9F9BHfBB https://t.co/v0CR…
###### Other stats
Sample Sizes : None.
Authors: 2
Total Words: 4837
Unique Words: 1426

##### #6. The Binary Space Partitioning-Tree Process
###### Xuhui Fan, Bin Li, Scott Anthony Sisson
The Mondrian process represents an elegant and powerful approach for space partition modelling. However, as it restricts the partitions to be axis-aligned, its modelling flexibility is limited. In this work, we propose a self-consistent Binary Space Partitioning (BSP)-Tree process to generalize the Mondrian process. The BSP-Tree process is an almost surely right continuous Markov jump process that allows uniformly distributed oblique cuts in a two-dimensional convex polygon. The BSP-Tree process can also be extended using a non-uniform probability measure to generate direction differentiated cuts. The process is also self-consistent, maintaining distributional invariance under a restricted subdomain. We use Conditional-Sequential Monte Carlo for inference using the tree structure as the high-dimensional variable. The BSP-Tree process's performance on synthetic data partitioning and relational modelling demonstrates clear inferential improvements over the standard Mondrian process and other related methods.
###### Tweets
arxivml: "The Binary Space Partitioning-Tree Process", Xuhui Fan, Bin Li, Scott Anthony Sisson https://t.co/VlV2vPC9Rg
SciFi: The Binary Space Partitioning-Tree Process. https://t.co/in21zw1lZ2
arxiv_cs_LG: The Binary Space Partitioning-Tree Process. Xuhui Fan, Bin Li, and Scott Anthony Sisson https://t.co/GsLn2naJM0
muktabh: RT @arxiv_cs_LG: The Binary Space Partitioning-Tree Process. Xuhui Fan, Bin Li, and Scott Anthony Sisson https://t.co/GsLn2naJM0
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 8656
Unique Words: 2321

##### #7. Binary Space Partitioning Forests
###### Xuhui Fan, Bin Li, Scott Anthony Sisson
The Binary Space Partitioning (BSP)-Tree process is proposed to produce flexible 2-D partition structures which are originally used as a Bayesian nonparametric prior for relational modelling. It can hardly be applied to other learning tasks such as regression trees because extending the BSP-Tree process to a higher dimensional space is nontrivial. This paper is the first attempt to extend the BSP-Tree process to a d-dimensional (d>2) space. We propose to generate a cutting hyperplane, which is assumed to be parallel to d-2 dimensions, to cut each node in the d-dimensional BSP-tree. By designing a subtle strategy to sample two free dimensions from d dimensions, the extended BSP-Tree process can inherit the essential self-consistency property from the original version. Based on the extended BSP-Tree process, an ensemble model, which is named the BSP-Forest, is further developed for regression tasks. Thanks to the retained self-consistency property, we can thus significantly reduce the geometric calculations in the inference stage....
###### Tweets
arxivml: "Binary Space Partitioning Forests", Xuhui Fan, Bin Li, Scott Anthony Sisson https://t.co/XCe37honWc
SciFi: Binary Space Partitioning Forests. https://t.co/odppJemeyH
arxiv_cs_LG: Binary Space Partitioning Forests. Xuhui Fan, Bin Li, and Scott Anthony Sisson https://t.co/G83Qj2V72y
muktabh: RT @arxiv_cs_LG: Binary Space Partitioning Forests. Xuhui Fan, Bin Li, and Scott Anthony Sisson https://t.co/G83Qj2V72y
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unique Words: 0

##### #8. Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions
###### Kurt Izak Cabanilla, Kevin Thomas Go
Sometimes knowing the future given the present is not enough. For sound policy making, predicting possible futures given different user defined scenarios can be more important. However, the workhorse for causality detection and impulse response, the Vector Autoregression (VAR), assumes linearity and has produced poor forecasts (Reis, 2018). Here, we introduce a vector autoencoder nonlinear autoregression neural network (VANAR) capable of both automatic time series feature extraction for its inputs and automatic functional form estimation. We compare the performance of VANAR and VAR across three tests: (1) forecasting skill, measured as n-step ahead forecast accuracy, (2) correct detection of Granger Causality between variables, and (3) impulse response tests on modeled trajectories subject to external shocks. These tests were performed on datasets with different underlying dynamics: a simulated nonlinear chaotic system, a simulated linear system, and an empirical system using Philippine macroeconomic data. Results show that VANAR...
###### Tweets
arxivml: "Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions", Ku… https://t.co/o3zcfbNyRF
arxiv_cs_LG: Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions. Kurt Izak Cabanilla and Kevin Thomas Go https://t.co/vBbq0xPC10
Memoirs: Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions. https://t.co/LVTabHrN5i
###### Other stats
Sample Sizes : None.
Authors: 2
Total Words: 7736
Unique Words: 1939

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 100,377 papers.
