State-of-the-art methods for Bayesian inference on regression models with
binary responses are either computationally impractical or inaccurate in high
dimensions. To fill this gap, we propose a novel variational approximation for
the posterior distribution of the coefficients in high-dimensional probit
regression. Our method leverages a representation with global and local
variables but, unlike classical mean-field assumptions, it avoids a fully
factorized approximation and instead assumes a factorization only for the
local variables. We prove that the resulting variational approximation belongs
to a tractable class of unified skew-normal distributions that preserves the
skewness of the actual posterior and, unlike state-of-the-art variational
Bayes solutions, converges to the exact posterior as the number of predictors p
increases. A scalable coordinate ascent variational algorithm is proposed to
obtain the optimal parameters of the approximating densities. As we show with
both theoretical results and an application to...

StatsPapers:
Asymptotically Exact Variational Bayes for High-Dimensional Binary Regression Models. https://t.co/EwhNYo27fo

Authors: 3

Total Words: 0

Unique Words: 0

Given a dataset of careers and incomes, how large is the difference in income
between any pair of careers? Given a dataset of travel time records, how much
longer does a trip take when choosing public transportation mode $A$ instead
of $B$? In this paper, we propose a framework that infers orders of categories
as well as the magnitudes of the differences in real-valued measurements
between each pair of categories, using the estimation statistics framework.
Our framework not only reports whether an order of categories exists, but
also reports the magnitude of the difference between each consecutive pair of
categories in the order. On large datasets, our framework scales well
compared with the existing framework. The proposed framework has been applied
to two real-world case studies: 1) ordering careers by income based on
information on 350,000 households living in Khon Kaen province, Thailand, and
2) ordering sectors by closing prices based on the closing prices of 1060
companies on the NASDAQ stock market between 2000 and 2016. The...
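The idea of reporting both an order and the size of each consecutive gap can be illustrated with a minimal sketch. It assumes a percentile bootstrap interval for the difference in means, which is a common estimation-statistics tool but not necessarily the paper's procedure. The function name `bootstrap_mean_diff_ci`, the category names, and the synthetic incomes are all hypothetical.

```python
import numpy as np

def bootstrap_mean_diff_ci(a, b, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for mean(b) - mean(a)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each category independently, with replacement.
        resampled_a = rng.choice(a, size=a.size)
        resampled_b = rng.choice(b, size=b.size)
        diffs[i] = resampled_b.mean() - resampled_a.mean()
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(diffs, [tail, 1.0 - tail])
    return float(lower), float(upper)

# Synthetic toy data (hypothetical categories and values, not from the paper).
data_rng = np.random.default_rng(1)
incomes = {
    "farmer": data_rng.normal(100.0, 10.0, 200),
    "teacher": data_rng.normal(130.0, 10.0, 200),
    "engineer": data_rng.normal(170.0, 10.0, 200),
}

# Candidate order: sort categories by sample mean.
order = sorted(incomes, key=lambda c: incomes[c].mean())

# Interval for the gap between each consecutive pair in the order.
steps = [
    (low, high, bootstrap_mean_diff_ci(incomes[low], incomes[high]))
    for low, high in zip(order, order[1:])
]

for low, high, (lower, upper) in steps:
    # A lower bound above zero supports placing `high` after `low`.
    print(f"{low} -> {high}: difference in [{lower:.1f}, {upper:.1f}]")
```

An interval that excludes zero supports the ordering of that pair, and its endpoints bound the magnitude of the gap.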

StatsPapers:
A nonparametric framework for inferring orders of categorical data from category-real ordered pairs. https://t.co/ErV8UIHLmJ

Lights_Eyes:
Our pre-print paper has been archived online at ArXiv https://t.co/Lia5n6ggG2 (statistical methodology) <https://t.co/78Dv9tBrJP>. The R package of this work is at https://t.co/hwFxtXwoKS. https://t.co/J7Ojli2ifk

Authors: 4

Total Words: 8471

Unique Words: 2139

The de facto standard for causal inference is the randomized controlled
trial, where one compares a manipulated group with a control group in order to
determine the effect of an intervention. However, this research design is not
always realistically possible due to pragmatic or ethical concerns. In these
situations, quasi-experimental designs may provide a solution, as these allow
for causal conclusions at the cost of additional design assumptions. In this
paper, we provide a generic framework for quasi-experimental design using
Bayesian model comparison, and we show how it can be used as an alternative to
several common research designs. We provide a theoretical motivation for a
Gaussian process based approach and demonstrate its convenient use in a number
of simulations. Finally, we apply the framework to determine the effect of
population-based thresholds for municipality funding in France, of the 2005
smoking ban in Sicily on the number of acute coronary events, and of an
alleged historical phantom border in the...

Memoirs:
Causal inference using Bayesian non-parametric quasi-experimental design. https://t.co/3Oy0tOg0oE

Authors: 3

Total Words: 0

Unique Words: 0

Empirical evidence, e.g., an observed likelihood ratio, is an estimator of the
difference of the divergences between two competing models (or, model sets) and
the true generating mechanism. It is unclear how to use such empirical evidence
in scientific practice. Scientists usually want to know "how often would I get
this level of evidence". The answer to this question depends on the true
generating mechanism along with the models under consideration. In many
situations, having observed the data, we can approximate the true generating
mechanism non-parametrically by assuming far less structure than the parametric
models being compared. We use a resampling method based on the non-parametric
estimate of the true generating mechanism to estimate a confidence interval for
the empirical evidence that is robust to model misspecification. Such a
confidence interval tells us how variable the empirical evidence would be if
the experiment (or observational study) were to be replicated. In our
simulations, variability in empirical evidence...
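The resampling idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two competing models (a normal and a Laplace family fitted by maximum likelihood), the percentile interval, and all function names are choices made here for the example.

```python
import numpy as np

def loglik_normal(x):
    """Maximized log-likelihood of a normal model."""
    mu, sigma = x.mean(), x.std()
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (x - mu) ** 2 / (2 * sigma**2))

def loglik_laplace(x):
    """Maximized log-likelihood of a Laplace model."""
    loc = np.median(x)
    scale = np.mean(np.abs(x - loc))
    return np.sum(-np.log(2 * scale) - np.abs(x - loc) / scale)

def evidence(x):
    """Empirical evidence: log-likelihood ratio, normal versus Laplace."""
    return float(loglik_normal(x) - loglik_laplace(x))

def bootstrap_evidence_ci(x, n_boot=1000, level=0.95, seed=0):
    """Percentile interval from a non-parametric bootstrap of the data."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Resampling the observed data stands in for the unknown generating
    # mechanism; both models are refitted on every resample.
    stats = np.array([evidence(rng.choice(x, size=x.size))
                      for _ in range(n_boot)])
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(stats, [tail, 1.0 - tail])
    return float(lower), float(upper)

# Synthetic data (hypothetical): 300 draws from a standard normal.
data = np.random.default_rng(42).normal(0.0, 1.0, 300)
observed = evidence(data)
interval = bootstrap_evidence_ci(data)
print(f"observed evidence {observed:.2f}, interval {interval}")
```

Because the resamples come from the data rather than from either fitted model, the interval does not rely on one of the candidate models being correct.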

StatsPapers:
Assessing the uncertainty in statistical evidence with the possibility of model misspecification using a non-parametric bootstrap. https://t.co/wsSbUqjCaH

Authors: 4

Total Words: 0

Unique Words: 0

This work describes the R package GET that implements global envelopes, which
can be employed for central regions of functional or multivariate data, for
graphical Monte Carlo and permutation tests where the test statistic is
multivariate or functional, and for global confidence and prediction bands.
The intrinsic graphical interpretation property is introduced for global
envelopes, and the global envelopes included in the GET package that have the
property are described and compared. Examples of different uses of global
envelopes and their
implementation in the GET package are presented, including global envelopes for
single and several one- or two-dimensional functions, goodness-of-fit and
permutation tests, graphical functional analysis of variance (ANOVA) and
general linear model (GLM), comparison of distributions, and confidence bands
in polynomial regression.
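To make the notion of a global envelope concrete, here is a minimal sketch of one simple construction based on extreme ranks, applied to simulated curves. It is an assumption about how such an envelope can be built, written in Python for illustration; it is not the GET package's API or implementation, and the function name `rank_envelope` and the toy curves are hypothetical.

```python
import numpy as np

def rank_envelope(curves, alpha=0.05):
    """Global envelope from pointwise ranks of s curves on d grid points.

    Returns (lower, upper) such that at least a share 1 - alpha of the
    curves lies inside the band at every grid point simultaneously.
    Assumes no ties among the curve values at any grid point.
    """
    s = curves.shape[0]
    # Pointwise rank of each curve, counted from below (1 = smallest).
    from_below = curves.argsort(axis=0).argsort(axis=0) + 1
    from_above = s + 1 - from_below
    # Extreme rank: how close a curve ever gets to the pointwise extremes.
    extreme = np.minimum(from_below, from_above).min(axis=1)
    # Largest k such that at most a share alpha of curves has rank below k.
    k = 1
    while np.mean(extreme < k + 1) <= alpha:
        k += 1
    ordered = np.sort(curves, axis=0)
    # Band between the k-th smallest and k-th largest value at each point.
    return ordered[k - 1], ordered[s - k]

# Toy functional data (hypothetical): noisy sine curves on 20 grid points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)
curves = np.sin(2 * np.pi * t) + rng.normal(0.0, 0.3, size=(1000, t.size))

lower, upper = rank_envelope(curves, alpha=0.05)
inside = np.all((curves >= lower) & (curves <= upper), axis=1)
print(f"share of curves entirely inside the band: {inside.mean():.3f}")
```

The band is global in the sense that coverage is judged over the whole curve at once rather than separately at each grid point.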

StatsPapers:
GET: Global envelopes in R. https://t.co/nfhKpAQH3C

Authors: 2

Total Words: 0

Unique Words: 0

Akaike's Bayesian information criterion (ABIC) has been widely used in
geophysical inversion and beyond. However, little has been done to investigate
its statistical aspects. We present an alternative derivation of the marginal
distribution of measurements, whose maximization directly leads to the
invention of ABIC by Akaike. We show that ABIC amounts to statistically
estimating the variance of measurements and the prior variance by maximizing
the marginal distribution of measurements. The determination of the
regularization parameter
on the basis of ABIC is actually equivalent to estimating the relative
weighting factor between the variance of measurements and the prior variance
for geophysical inverse problems. We show that if the noise level of
measurements is unknown, ABIC tends to produce a substantially biased estimate
of the variance of measurements. In particular, since the prior mean is
generally unknown but arbitrarily treated as zero in geophysical inversion,
ABIC does not produce a reasonable estimate for the prior variance either.

StatsPapers:
Akaike's Bayesian information criterion (ABIC) or not ABIC for geophysical inversion. https://t.co/pgXvoXScq5

Authors: 1

Total Words: 0

Unique Words: 0

With the recent growth in data availability and complexity, and the
associated outburst of elaborate modeling approaches, model selection tools
have become a lifeline, providing objective criteria to deal with this
increasingly challenging landscape. In fact, basing predictions and inference
on a single model may be limiting if not harmful; ensemble approaches, which
combine different models, have been proposed to overcome the selection step,
and proven fruitful especially in the supervised learning framework.
Conversely, these approaches have been scantily explored in the unsupervised
setting. In this work we focus on the model-based clustering formulation, where
a plethora of mixture models, with different numbers of components and
parametrizations, is typically estimated. We propose an ensemble clustering
approach that circumvents the single best model paradigm, while improving
stability and robustness of the partitions. A new density estimator, being a
convex linear combination of the density estimates in the ensemble,...
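A convex linear combination of density estimates can be sketched in a few lines. The two member densities and the weights below are hypothetical stand-ins for fitted mixture models and for whatever weighting scheme the paper uses; only the combination step itself is illustrated.

```python
import numpy as np

def gaussian_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def one_component(x):
    """Hypothetical fitted model with a single Gaussian component."""
    return gaussian_pdf(x, 0.0, 2.0)

def two_component(x):
    """Hypothetical fitted model with two Gaussian components."""
    return 0.5 * gaussian_pdf(x, -1.5, 1.0) + 0.5 * gaussian_pdf(x, 1.5, 1.0)

def ensemble_density(x, densities, weights):
    """Convex combination: non-negative weights that sum to one."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * f(x) for w, f in zip(weights, densities))

grid = np.linspace(-12.0, 12.0, 4801)
members = [one_component, two_component]
dens = ensemble_density(grid, members, [0.3, 0.7])

# Riemann-sum check that the combination is still a density.
mass = float(np.sum(dens) * (grid[1] - grid[0]))
print(f"total mass on the grid: {mass:.4f}")
```

Because the weights are non-negative and sum to one, the combination is itself a valid density whenever every member is.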

StatsPapers:
How bettering the best? Answers via blending models and cluster formulations in density-based clustering. https://t.co/3mg8iSYpnn

Authors: 3

Total Words: 0

Unique Words: 0

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real time) based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

*Tracking 223,556 papers.*
