Multi-domain learning (MDL) aims at obtaining a model with minimal average
risk across multiple domains. Our empirical motivation is automated microscopy
data, where cultured cells are imaged after being exposed to known and unknown
chemical perturbations, and each dataset displays significant experimental
bias. This paper presents a multi-domain adversarial learning approach, MuLANN,
to leverage multiple datasets with overlapping but distinct class sets, in a
semi-supervised setting. Our contributions include: i) a bound on the average-
and worst-domain risk in MDL, obtained using the H-divergence; ii) a new loss
to accommodate semi-supervised multi-domain learning and domain adaptation;
iii) the experimental validation of the approach, improving on the state of the
art on two standard image benchmarks, and a novel bioimage dataset, Cell.

arxiv_org:
Multi-Domain Adversarial Learning. https://t.co/34sEhwLFDO https://t.co/i9zSVAjM2y

arxivml:
"Multi-Domain Adversarial Learning",
Alice Schoenauer-Sebag, Louise Heinrich, Marc Schoenauer, Michele Sebag, Lani …
https://t.co/Bx9pb4SDxm

StatsPapers:
Multi-Domain Adversarial Learning. https://t.co/XbFg9czxNk

Code and data of the "Multi-domain adversarial learning" paper, Schoenauer-Sebag et al., accepted at ICLR 2019

Stargazers: 1

Subscribers: 2

Forks: 0

Open Issues: 0

Sample Sizes: None.

Authors: 6

Total Words: 11637

Unique Words: 3752

How well can we estimate the probability that the classification, $C(f(x))$,
predicted by a deep neural network is correct (or in the Top 5)? We consider
the case of a classification neural network trained with the KL divergence,
which is assumed to generalize, as measured empirically by the test error and
test loss. We present conditional probabilities for predictions based on the
histogram of uncertainty metrics, which have a significant Bayes ratio.
Previous work in this area includes Bayesian neural networks. Based on the
expected Bayes ratio, our metric is twice as predictive on ImageNet as our
best tuned implementation of Bayesian dropout (Gal and Ghahramani, 2016). Our
method uses just the softmax values and a stored histogram, so it is essentially
free to compute, compared to many times the inference cost for Bayesian dropout.
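
The idea of reading a confidence off a stored histogram can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the uncertainty metric is the maximum softmax value, uses ten equal-width bins, and the function names and toy data are hypothetical.

```python
import numpy as np

def fit_confidence_histogram(scores, correct, n_bins=10):
    """Store per-bin accuracy of held-out predictions, binned by a score in [0, 1]."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Use interior edges only, so bin ids fall in 0..n_bins-1
    # (a score of exactly 1.0 lands in the last bin).
    bins = np.digitize(scores, edges[1:-1])
    counts = np.bincount(bins, minlength=n_bins)
    hits = np.bincount(bins, weights=np.asarray(correct, dtype=float), minlength=n_bins)
    # Empty bins carry no evidence, so they are marked NaN.
    accuracy = np.where(counts > 0, hits / np.maximum(counts, 1), np.nan)
    return edges, accuracy

def lookup_confidence(scores, edges, accuracy):
    """Estimate P(correct | score bin) by a table lookup in the stored histogram."""
    return accuracy[np.digitize(scores, edges[1:-1])]

# Toy held-out data (assumed): max softmax value and whether the prediction was right.
held_out_scores = np.array([0.55, 0.52, 0.58, 0.57, 0.95, 0.97, 0.93, 0.91])
held_out_correct = np.array([1, 0, 0, 1, 1, 1, 1, 1])

edges, accuracy = fit_confidence_histogram(held_out_scores, held_out_correct)
estimates = lookup_confidence(np.array([0.56, 0.96]), edges, accuracy)
```

At inference time the only extra work is one bin lookup per prediction, which is why an approach of this kind adds almost nothing to the cost of a forward pass.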

arxiv_org:
Empirical confidence estimates for classification by deep neural networks. https://t.co/aKcQ9XUDcZ https://t.co/aLjnikj1lL

bgoncalves:
Empirical confidence estimates for classification by deep neural networks. (arXiv:1903.09215v1 [https://t.co/dgBUOpxd8x]) https://t.co/0CPsI0x0nT

arxivml:
"Empirical confidence estimates for classification by deep neural networks",
Chris Finlay, Adam M. Oberman
https://t.co/YzJQs1E5xb

StatsPapers:
Empirical confidence estimates for classification by deep neural networks. https://t.co/Mmn7JQ00Hv

Sample Sizes: None.

Authors: 2

Total Words: 3773

Unique Words: 1244

Step sizes in neural network training are largely determined using
predetermined rules such as fixed learning rates and learning rate schedules,
which require user input to determine their functional form and associated
hyperparameters. Global optimization strategies to resolve these
hyperparameters are computationally expensive. Line searches are capable of
adaptively resolving learning rate schedules. However, due to discontinuities
induced by mini-batch sampling, they have largely fallen out of favor.
Notwithstanding, probabilistic line searches have recently demonstrated
viability in resolving learning rates for stochastic loss functions. This
method creates surrogates with confidence intervals, where restrictions are
placed on the rate at which the search domain can grow along a search
direction.
This paper introduces an alternative paradigm, Gradient-Only Line Searches
that are inexact (GOLS-I), to automatically resolve learning rates in
stochastic cost functions over a range of 15 orders...
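
To make the "gradient-only" idea concrete, here is a generic sketch of a step-size search that never evaluates the loss and only looks for a sign change of the directional derivative along the search direction. It is not the authors' GOLS-I algorithm: the doubling and halving rule, the function name `gradient_only_step`, and the quadratic test problem are all assumptions chosen for illustration. In a stochastic setting, `grad_fn` would be evaluated on mini-batches.

```python
import numpy as np

def gradient_only_step(grad_fn, x, d, alpha0=1e-3, factor=2.0, max_iter=60):
    """Find a step size near the sign change (negative to non-negative)
    of the directional derivative along d, using gradients only."""
    def slope(alpha):
        # Directional derivative at the trial point x + alpha * d.
        return float(grad_fn(x + alpha * d) @ d)

    alpha = alpha0
    if slope(alpha) < 0:
        # Still descending at the initial guess: grow the step until the slope turns non-negative.
        for _ in range(max_iter):
            if slope(alpha) >= 0:
                break
            alpha *= factor
    else:
        # Initial guess overshoots: shrink the step until the slope is negative again.
        for _ in range(max_iter):
            if slope(alpha) < 0:
                break
            alpha /= factor
    return alpha

# Toy problem (assumed): f(x) = 0.5 * ||x||^2, whose gradient is x.
grad_fn = lambda x: x
x0 = np.array([1.0])
d = -grad_fn(x0)  # steepest-descent direction

alpha_grow = gradient_only_step(grad_fn, x0, d)               # starts too small
alpha_shrink = gradient_only_step(grad_fn, x0, d, alpha0=8.0)  # starts too large
```

On this quadratic the exact minimizer along `d` is at a step of 1, and both searches stop within a factor of two of it, which is the sense in which such a search is inexact.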

arxivml:
"Gradient-only line searches: An Alternative to Probabilistic Line Searches",
Dominic Kafka, Daniel Wilke
https://t.co/wdmkqixs6c

daniwi79:
As mentioned in my talk Untangling Information for #MachineLearning and #Deeplearning Training for #DataScientists @nvidia @NvidiaAI #GTC19 #GTC2019 our two papers: https://t.co/wXUrgj4Wy5
and https://t.co/DdG2nrK0sa
#PyTorch and #TensorFlow code is coming soon.

daniwi79:
@JeffDean @GoogleAI @berkeley_ai Excellence in collaboration with @GoogleAI! Hope to see collaboration with #Africa and in particular #SouthAfrica growing with institutes like @UPTuks adding to the diversity of thought and understanding https://t.co/wXUrgj4Wy5
https://t.co/DdG2nrK0sa
https://t.co/ChACKG2RJ6

arxiv_cs_LG:
Gradient-only line searches: An Alternative to Probabilistic Line Searches. Dominic Kafka and Daniel Wilke https://t.co/yFeiauyCd8

Memoirs:
Gradient-only line searches: An Alternative to Probabilistic Line Searches. https://t.co/3B1BT5MavI

Sample Sizes: None.

Authors: 2

Total Words: 11030

Unique Words: 2503

Time series data in the retail world are particularly rich in terms of
dimensionality, and these dimensions can be aggregated in groups or
hierarchies. Valuable information is nested in these complex structures, which
helps to predict the aggregated time series data. From a portfolio of brands
under HUUB's monitoring, we selected two to explore their sales behaviour,
leveraging the grouping properties of their product structure. We forecast
each level of the hierarchy with statistical models, namely SARIMA, and then
apply an optimal combination approach to generate more consistent forecasts at
the higher levels. Our results show that the proposed methods can indeed
capture nested information in the more granular series, helping to improve the
forecast accuracy of the aggregated series. The Weighted Least Squares (WLS)
method surpasses all other methods proposed in the study, including the Minimum
Trace (MinT) reconciliation.
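
As a rough illustration of optimal combination with weighted least squares, the standard reconciliation formula maps base forecasts $\hat{y}$ to coherent ones via $\tilde{y} = S (S^\top W^{-1} S)^{-1} S^\top W^{-1} \hat{y}$, where $S$ is the summing matrix and $W$ is diagonal. The sketch below assumes a toy hierarchy with one total and two bottom series; the numbers, the weights, and the name `reconcile_wls` are hypothetical and not taken from the paper.

```python
import numpy as np

def reconcile_wls(S, y_hat, weights):
    """Reconcile base forecasts so that aggregates equal the sum of their children.

    S       : summing matrix (all series x bottom series)
    y_hat   : base forecasts for all series, in the row order of S
    weights : diagonal of W, e.g. base-forecast error variances (assumed values here)
    """
    W_inv = np.diag(1.0 / np.asarray(weights, dtype=float))
    # G maps base forecasts of all series to reconciled bottom-level forecasts.
    G = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
    return S @ (G @ y_hat)

# Toy hierarchy (assumed): total = A + B.
S = np.array([[1.0, 1.0],   # total
              [1.0, 0.0],   # series A
              [0.0, 1.0]])  # series B

# Incoherent base forecasts: 55 + 50 != 100.
y_hat = np.array([100.0, 55.0, 50.0])
y_tilde = reconcile_wls(S, y_hat, weights=[1.0, 1.0, 1.0])
```

With equal weights this reduces to ordinary least squares; unequal weights pull the reconciled values toward the series whose base forecasts are trusted more.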

arxivml:
"Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data",
Luis Roque, Cristina A. C. Fernandes, Tony …
https://t.co/AjrCHRjK4u

arxiv_cs_LG:
Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data. Luis Roque, Cristina A. C. Fernandes, and Tony Silva https://t.co/pd7AYTLNtC

StatsPapers:
Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data. https://t.co/789ztWauik

cris_cfernandes:
The first of a series of papers about HUUB's retail forecasting methodology: https://t.co/NSeT3MGW0w

Sample Sizes: None.

Authors: 3

Total Words: 6936

Unique Words: 1942

We consider Mixed Linear Regression (MLR), where training data have
been generated from a mixture of distinct linear models (or clusters) and we
seek to identify the corresponding coefficient vectors. We introduce a
Mixed Integer Programming (MIP) formulation for MLR subject to regularization
constraints on the coefficient vectors. We establish that as the number of
training samples grows large, the MIP solution converges to the true
coefficient vectors in the absence of noise. Subject to slightly stronger
assumptions, we also establish that the MIP identifies the clusters from which
the training samples were generated. In the special case where training data
come from a single cluster, we establish that the corresponding MIP yields a
solution that converges to the true coefficient vector even when training data
are perturbed by (martingale difference) noise. We provide a counterexample
indicating that in the presence of noise, the MIP may fail to produce the true
coefficient vectors for more than one cluster. We also...

arxiv_org:
Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models. https://t.co/CO9F9BHfBB https://t.co/v0CR6Yp4jp

arxivml:
"Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models",
Taiyao Wang, Ioannis Ch. Pasch…
https://t.co/6z8gEq28PJ

StatsPapers:
Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models. https://t.co/q0t9HrAXyN

Sample Sizes: None.

Authors: 2

Total Words: 4837

Unique Words: 1426

The Mondrian process represents an elegant and powerful approach for space
partition modelling. However, as it restricts the partitions to be
axis-aligned, its modelling flexibility is limited. In this work, we propose a
self-consistent Binary Space Partitioning (BSP)-Tree process to generalize the
Mondrian process. The BSP-Tree process is an almost surely right continuous
Markov jump process that allows uniformly distributed oblique cuts in a
two-dimensional convex polygon. The BSP-Tree process can also be extended using
a non-uniform probability measure to generate direction differentiated cuts.
The process is also self-consistent, maintaining distributional invariance
under a restricted subdomain. We use Conditional-Sequential Monte Carlo for
inference using the tree structure as the high-dimensional variable. The
BSP-Tree process's performance on synthetic data partitioning and relational
modelling demonstrates clear inferential improvements over the standard
Mondrian process and other related methods.

arxivml:
"The Binary Space Partitioning-Tree Process",
Xuhui Fan, Bin Li, Scott Anthony Sisson
https://t.co/VlV2vPC9Rg

SciFi:
The Binary Space Partitioning-Tree Process. https://t.co/in21zw1lZ2

arxiv_cs_LG:
The Binary Space Partitioning-Tree Process. Xuhui Fan, Bin Li, and Scott Anthony Sisson https://t.co/GsLn2naJM0

Sample Sizes: None.

Authors: 3

Total Words: 8656

Unique Words: 2321

The Binary Space Partitioning (BSP)-Tree process was proposed to produce
flexible 2-D partition structures, which were originally used as a Bayesian
nonparametric prior for relational modelling. It can hardly be applied to other
learning tasks such as regression trees because extending the BSP-Tree process
to a higher dimensional space is nontrivial. This paper is the first attempt to
extend the BSP-Tree process to a d-dimensional (d>2) space. We propose to
generate a cutting hyperplane, which is assumed to be parallel to d-2
dimensions, to cut each node in the d-dimensional BSP-tree. By designing a
subtle strategy to sample two free dimensions from d dimensions, the extended
BSP-Tree process can inherit the essential self-consistency property from the
original version. Based on the extended BSP-Tree process, an ensemble model,
which is named the BSP-Forest, is further developed for regression tasks.
Thanks to the retained self-consistency property, we can thus significantly
reduce the geometric calculations in the inference stage....

arxivml:
"Binary Space Partitioning Forests",
Xuhui Fan, Bin Li, Scott Anthony Sisson
https://t.co/XCe37honWc

SciFi:
Binary Space Partitioning Forests. https://t.co/odppJemeyH

arxiv_cs_LG:
Binary Space Partitioning Forests. Xuhui Fan, Bin Li, and Scott Anthony Sisson https://t.co/G83Qj2V72y

Sample Sizes: None.

Authors: 3

Total Words: 0

Unique Words: 0

Sometimes knowing the future given the present is not enough. For sound
policy making, predicting possible futures given different user-defined
scenarios can be more important. However, the workhorse for causality detection
and impulse response, the Vector Autoregression (VAR), assumes linearity and
has produced poor forecasts (Reis, 2018). Here, we introduce a vector
autoencoder nonlinear autoregression neural network (VANAR) capable of both
automatic time series feature extraction for its inputs and automatic
functional form estimation. We compare the performance of VANAR and VAR across
three tests: (1) forecasting skill, measured as n-step ahead forecast accuracy,
(2) correct detection of Granger Causality between variables, and (3) impulse
response tests on modeled trajectories subject to external shocks. These tests
were performed on datasets with different underlying dynamics: a simulated
nonlinear chaotic system, a simulated linear system, and an empirical system
using Philippine macroeconomic data. Results show that VANAR...
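
For context on the linear baseline, the impulse response of a first-order VAR, $x_t = A x_{t-1} + e_t$, to a one-time shock $e$ is $A^h e$ at horizon $h$. The sketch below illustrates only that baseline computation, not VANAR; the coefficient matrix, the shock, and the function name `var1_impulse_response` are assumptions.

```python
import numpy as np

def var1_impulse_response(A, shock, horizon):
    """Trace the response of a VAR(1) system x_t = A x_{t-1} + e_t
    to a one-time shock at t = 0, with no further shocks afterwards."""
    responses = [np.asarray(shock, dtype=float)]
    for _ in range(horizon):
        # Each step propagates the previous response through the coefficient matrix.
        responses.append(A @ responses[-1])
    return np.stack(responses)  # shape: (horizon + 1, n_variables)

# Toy coefficient matrix (assumed): variable 0 feeds into variable 1, but not the reverse.
A = np.array([[0.5, 0.0],
              [0.1, 0.5]])

# Unit shock to variable 0, followed for two steps.
irf = var1_impulse_response(A, shock=[1.0, 0.0], horizon=2)
```

In this toy system the nonzero entry `A[1, 0]` is what lets past values of variable 0 help predict variable 1, which is the linear notion of Granger causality that a test on a fitted VAR would look for.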

arxivml:
"Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions",
Ku…
https://t.co/o3zcfbNyRF

arxiv_cs_LG:
Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions. Kurt Izak Cabanilla and Kevin Thomas Go https://t.co/vBbq0xPC10

Memoirs:
Impulse Response and Granger Causality in Dynamical Systems with Autoencoder Nonlinear Vector Autoregressions. https://t.co/LVTabHrN5i

Sample Sizes: None.

Authors: 2

Total Words: 7736

Unique Words: 1939

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real time) based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

*Tracking 100,377 papers.*