Top 7 arXiv Papers Today in Methodology


2.131 Mikeys
#1. Asymptotically Exact Variational Bayes for High-Dimensional Binary Regression Models
Augusto Fasano, Daniele Durante, Giacomo Zanella
State-of-the-art methods for Bayesian inference on regression models with binary responses are either computationally impractical or inaccurate in high dimensions. To cover this gap we propose a novel variational approximation for the posterior distribution of the coefficients in high-dimensional probit regression. Our method leverages a representation with global and local variables but, unlike for classical mean-field assumptions, it avoids a fully factorized approximation, and instead assumes a factorization only for the local variables. We prove that the resulting variational approximation belongs to a tractable class of unified skew-normal distributions that preserves the skewness of the actual posterior and, unlike for state-of-the-art variational Bayes solutions, converges to the exact posterior as the number of predictors p increases. A scalable coordinate ascent variational algorithm is proposed to obtain the optimal parameters of the approximating densities. As we show with both theoretical results and an application to...
Figures
None.
Tweets
StatsPapers: Asymptotically Exact Variational Bayes for High-Dimensional Binary Regression Models. https://t.co/EwhNYo27fo
paulportesi: RT @StatsPapers: Asymptotically Exact Variational Bayes for High-Dimensional Binary Regression Models. https://t.co/EwhNYo27fo
ibu_hoshina: RT @StatsPapers: Asymptotically Exact Variational Bayes for High-Dimensional Binary Regression Models. https://t.co/EwhNYo27fo
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 0
Unique Words: 0

2.123 Mikeys
#2. A nonparametric framework for inferring orders of categorical data from category-real ordered pairs
Chainarong Amornbunchornvej, Navaporn Surasvadi, Anon Plangprasopchok, Suttipong Thajchayapong
Given a dataset of careers and incomes, how large is the difference in income between any pair of careers? Given a dataset of travel time records, how much longer does it take to travel by public transportation mode $A$ instead of $B$? In this paper, we propose a framework that infers orders of categories as well as the magnitudes of the differences in real-valued measurements between each pair of categories, using the estimation statistics framework. Beyond reporting whether an order of categories exists, our framework also reports the magnitude of the difference between each consecutive pair of categories in the order. On large datasets, our framework scales well compared with the existing framework. The proposed framework has been applied to two real-world case studies: 1) ordering careers by income based on information from 350,000 households living in Khon Kaen province, Thailand, and 2) ordering sectors by closing prices based on 1060 companies' closing prices on the NASDAQ stock market between 2000 and 2016. The...
Figures
Tweets
StatsPapers: A nonparametric framework for inferring orders of categorical data from category-real ordered pairs. https://t.co/ErV8UIHLmJ
Lights_Eyes: Our pre-print paper has been archived online at ArXiv https://t.co/Lia5n6ggG2 (statistical methodology) <https://t.co/78Dv9tBrJP>. The R package of this work is at https://t.co/hwFxtXwoKS. https://t.co/J7Ojli2ifk
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 4
Total Words: 8471
Unique Words: 2139

2.032 Mikeys
#3. Causal inference using Bayesian non-parametric quasi-experimental design
Max Hinne, Marcel A. J. van Gerven, Luca Ambrogioni
The de facto standard for causal inference is the randomized controlled trial, where one compares a manipulated group with a control group in order to determine the effect of an intervention. However, this research design is not always realistically possible due to pragmatic or ethical concerns. In these situations, quasi-experimental designs may provide a solution, as these allow for causal conclusions at the cost of additional design assumptions. In this paper, we provide a generic framework for quasi-experimental design using Bayesian model comparison, and we show how it can be used as an alternative to several common research designs. We provide a theoretical motivation for a Gaussian process based approach and demonstrate its convenient use in a number of simulations. Finally, we apply the framework to determine the effect of population-based thresholds for municipality funding in France, of the 2005 smoking ban in Sicily on the number of acute coronary events, and of the effect of an alleged historical phantom border in the...
Figures
None.
Tweets
Memoirs: Causal inference using Bayesian non-parametric quasi-experimental design. https://t.co/3Oy0tOg0oE
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 0
Unique Words: 0

2.028 Mikeys
#4. Assessing the uncertainty in statistical evidence with the possibility of model misspecification using a non-parametric bootstrap
Mark L. Taper, Subhash R Lele, José-Miguel Ponciano, Brian Dennis
Empirical evidence, e.g. an observed likelihood ratio, is an estimator of the difference of the divergences between two competing models (or model sets) and the true generating mechanism. It is unclear how to use such empirical evidence in scientific practice. Scientists usually want to know "how often would I get this level of evidence?" The answer to this question depends on the true generating mechanism along with the models under consideration. In many situations, having observed the data, we can approximate the true generating mechanism non-parametrically by assuming far less structure than the parametric models being compared. We use a resampling method based on the non-parametric estimate of the true generating mechanism to estimate a confidence interval for the empirical evidence that is robust to model misspecification. Such a confidence interval tells us how variable the empirical evidence would be if the experiment (or observational study) were to be replicated. In our simulations, variability in empirical evidence...
Figures
None.
Tweets
StatsPapers: Assessing the uncertainty in statistical evidence with the possibility of model misspecification using a non-parametric bootstrap. https://t.co/wsSbUqjCaH
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 4
Total Words: 0
Unique Words: 0
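
The resampling idea in the abstract of #4 can be illustrated with a minimal sketch. It is not the authors' implementation: the two competing models (normal and Laplace), the percentile interval, the synthetic data, and all function names are assumptions made for illustration. The data are resampled with replacement, which treats the empirical distribution as a non-parametric estimate of the generating mechanism, and the log-likelihood ratio is recomputed on each resample.

```python
import math
import random
import statistics


def normal_loglik(xs):
    """Maximized log-likelihood of a normal model (MLE mean and variance)."""
    n = len(xs)
    mu = statistics.fmean(xs)
    var = sum((x - mu) ** 2 for x in xs) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def laplace_loglik(xs):
    """Maximized log-likelihood of a Laplace model (MLE median and scale)."""
    n = len(xs)
    med = statistics.median(xs)
    scale = sum(abs(x - med) for x in xs) / n
    return -n * (math.log(2 * scale) + 1)


def evidence(xs):
    """Empirical evidence: log-likelihood ratio, normal versus Laplace."""
    return normal_loglik(xs) - laplace_loglik(xs)


def bootstrap_interval(xs, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the evidence (assumed interval type)."""
    rng = random.Random(seed)
    n = len(xs)
    # Each resample is drawn with replacement from the observed data.
    stats = sorted(evidence(rng.choices(xs, k=n)) for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]


# Synthetic data for illustration only; not taken from the paper.
_rng = random.Random(42)
data = [_rng.gauss(0.0, 1.0) for _ in range(200)]
print(evidence(data), bootstrap_interval(data))
```

Neither model needs to be correct for the interval to be computed, because the resampling uses only the observed data. A wide interval suggests the evidence would vary considerably under replication.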

2.028 Mikeys
#5. GET: Global envelopes in R
Mari Myllymäki, Tomáš Mrkvička
This work describes the R package GET that implements global envelopes, which can be employed for central regions of functional or multivariate data, for graphical Monte Carlo and permutation tests where the test statistic is multivariate or functional, and for global confidence and prediction bands. An intrinsic graphical interpretation property is introduced for global envelopes, and the global envelopes included in the GET package that have the property are described and compared. Examples of different uses of global envelopes and their implementation in the GET package are presented, including global envelopes for single and several one- or two-dimensional functions, goodness-of-fit and permutation tests, graphical functional analysis of variance (ANOVA) and general linear model (GLM), comparison of distributions, and confidence bands in polynomial regression.
Figures
None.
Tweets
StatsPapers: GET: Global envelopes in R. https://t.co/nfhKpAQH3C
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 0
Unique Words: 0

2.028 Mikeys
#6. Akaike's Bayesian information criterion (ABIC) or not ABIC for geophysical inversion
Peiliang Xu
Akaike's Bayesian information criterion (ABIC) has been widely used in geophysical inversion and beyond. However, little has been done to investigate its statistical aspects. We present an alternative derivation of the marginal distribution of measurements, whose maximization directly leads to the invention of ABIC by Akaike. We show that ABIC statistically estimates the variance of measurements and the prior variance by maximizing the marginal distribution of measurements. The determination of the regularization parameter on the basis of ABIC is actually equivalent to estimating the relative weighting factor between the variance of measurements and the prior variance for geophysical inverse problems. We show that if the noise level of measurements is unknown, ABIC tends to produce a substantially biased estimate of the variance of measurements. In particular, since the prior mean is generally unknown but arbitrarily treated as zero in geophysical inversion, ABIC does not produce a reasonable estimate for the prior variance either.
Figures
None.
Tweets
StatsPapers: Akaike's Bayesian information criterion (ABIC) or not ABIC for geophysical inversion. https://t.co/pgXvoXScq5
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 1
Total Words: 0
Unique Words: 0

2.028 Mikeys
#7. How bettering the best? Answers via blending models and cluster formulations in density-based clustering
Alessandro Casa, Luca Scrucca, Giovanna Menardi
With the recent growth in data availability and complexity, and the associated outburst of elaborate modeling approaches, model selection tools have become a lifeline, providing objective criteria to deal with this increasingly challenging landscape. In fact, basing predictions and inference on a single model may be limiting if not harmful; ensemble approaches, which combine different models, have been proposed to overcome the selection step, and proven fruitful especially in the supervised learning framework. Conversely, these approaches have been scantily explored in the unsupervised setting. In this work we focus on the model-based clustering formulation, where a plethora of mixture models, with different numbers of components and parametrizations, is typically estimated. We propose an ensemble clustering approach that circumvents the single best model paradigm, while improving stability and robustness of the partitions. A new density estimator, being a convex linear combination of the density estimates in the ensemble,...
Figures
None.
Tweets
StatsPapers: How bettering the best? Answers via blending models and cluster formulations in density-based clustering. https://t.co/3mg8iSYpnn
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 0
Unique Words: 0
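
The "convex linear combination of the density estimates" mentioned in the abstract of #7 can be sketched in a few lines. This is only an illustration of the general construction: the abstract is truncated before it says how the weights are chosen, so the weights below are supplied by hand, and the member densities and function names are hypothetical.

```python
from statistics import NormalDist


def mixture_pdf(components):
    """Density of a Gaussian mixture given (weight, NormalDist) pairs."""
    def pdf(x):
        return sum(w * dist.pdf(x) for w, dist in components)
    return pdf


def ensemble_pdf(densities, weights):
    """Convex combination of densities: weights are nonnegative and sum to 1."""
    if len(densities) != len(weights):
        raise ValueError("need one weight per density")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")

    def pdf(x):
        return sum(w * f(x) for w, f in zip(weights, densities))
    return pdf


# Two hypothetical fitted models with different numbers of components.
members = [
    mixture_pdf([(1.0, NormalDist(0.0, 1.5))]),
    mixture_pdf([(0.5, NormalDist(-1.0, 1.0)), (0.5, NormalDist(1.0, 1.0))]),
]
# Hand-picked weights, an assumption for illustration.
blend = ensemble_pdf(members, [0.3, 0.7])
print(blend(0.5))
```

Because the weights are nonnegative and sum to one, the blended function is itself a valid density, and its value at any point lies between the smallest and largest member density at that point.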

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 223,556 papers.
