Top 10 Arxiv Papers Today in Statistics Theory


2.023 Mikeys
#1. On compatibility/incompatibility of two discrete probability distributions in the presence of incomplete specification
Indranil Ghosh, N. Balakrishnan
Conditional specification of distributions is a developing area with many applications. In the finite discrete case, a variety of compatible conditions can be derived. In this paper, we propose an alternative approach to study the compatibility of two conditional probability distributions under the finite discrete set up. A technique based on rank-based criterion is shown to be particularly convenient for identifying compatible distributions corresponding to complete conditional specification, including the case with zeros. The proposed methods are finally illustrated with several examples.
Figures
None.
Tweets
ArtofWarm: On compatibility/incompatibility of two discrete probability distributions in the presence of incomplete specification https://t.co/kKbyQeD6lk https://t.co/PFcd3h8sUO
mathSTb: Indranil Ghosh, N.Balakrishnan : On compatibility/incompatibility of two discrete probability distributions in the presence of incomplete specification https://t.co/0kuBfpofXS https://t.co/5E7fT1QrtR
StatsPapers: On compatibility/incompatibility of two discrete probability distributions in the presence of incomplete specification. https://t.co/j9ppBc5xZ0
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

2.022 Mikeys
#2. Multivariate Rank-based Distribution-free Nonparametric Testing using Measure Transportation
Nabarun Deb, Bodhisattva Sen
In this paper, we propose a general framework for distribution-free nonparametric testing in multi-dimensions, based on a notion of multivariate ranks defined using the theory of measure transportation. Unlike other existing proposals in the literature, these multivariate ranks share a number of useful properties with the usual one-dimensional ranks; most importantly, these ranks are distribution-free. This crucial observation allows us to design nonparametric tests that are exactly distribution-free under the null hypothesis. We demonstrate the applicability of this approach by constructing exact distribution-free tests for two classical nonparametric problems: (i) testing for mutual independence between random vectors, and (ii) testing for the equality of multivariate distributions. In particular, we propose (multivariate) rank versions of distance covariance (Székely et al., 2007) and energy statistic (Székely and Rizzo, 2013) for testing scenarios (i) and (ii) respectively. In both these problems, we derive the asymptotic...
Figures
None.
Tweets
StatsPapers: Multivariate Rank-based Distribution-free Nonparametric Testing using Measure Transportation. https://t.co/ygK1DuSkA0
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

2.022 Mikeys
#3. Collective sampling through a Metropolis-Hastings like method: kinetic theory and numerical experiments
Grégoire Clarté, Antoine Diez
The classical Metropolis-Hastings algorithm provides a simple method to construct a Markov Chain with an arbitrary stationary measure. In order to implement Monte Carlo methods, an elementary approach would be to duplicate this algorithm as many times as desired. Following the ideas of Population Monte Carlo methods, we propose to take advantage of the number of duplicates to increase the efficiency of the naive approach. Within this framework, each chain is seen as the evolution of a single particle which interacts with the others. In this article, we propose a simple and efficient interaction mechanism and an analytical framework which ensures that the particles are asymptotically independent and identically distributed according to an arbitrary target law. This approach is also supported by numerical simulations showing better convergence properties compared to the classical Metropolis-Hastings algorithm.
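For readers unfamiliar with the baseline this abstract refers to, here is a minimal sketch of the classical random-walk Metropolis-Hastings algorithm. It is not the interacting-particle method proposed in the paper. The function names, the Gaussian proposal, the step size, and the standard normal target are all illustrative assumptions.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis-Hastings on the real line.

    log_target: log of the (possibly unnormalized) target density.
    Illustrative sketch only; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    x, log_p = x0, log_target(x0)
    samples = []
    for _ in range(n_steps):
        # Symmetric Gaussian proposal, so the proposal ratio cancels.
        y = x + rng.gauss(0.0, step_size)
        log_q = log_target(y)
        # Accept with probability min(1, target(y) / target(x)).
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        if math.log(u) < log_q - log_p:
            x, log_p = y, log_q
        samples.append(x)
    return samples


def log_std_normal(x):
    """Log-density of N(0, 1) up to an additive constant."""
    return -0.5 * x * x


chain = metropolis_hastings(log_std_normal, x0=0.0, n_steps=20000)
mean = sum(chain) / len(chain)
var = sum((s - mean) ** 2 for s in chain) / len(chain)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

Running several such chains independently is the "naive" duplication the abstract mentions; the sample mean and variance should land near 0 and 1 for this target.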
Figures
None.
Tweets
StatsPapers: Collective sampling through a Metropolis-Hastings like method: kinetic theory and numerical experiments. https://t.co/F1JGK5AFKj
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

2.022 Mikeys
#4. Generalized Resilience and Robust Statistics
Banghua Zhu, Jiantao Jiao, Jacob Steinhardt
Robust statistics traditionally focuses on outliers, or perturbations in total variation distance. However, a dataset could be corrupted in many other ways, such as systematic measurement errors and missing covariates. We generalize the robust statistics approach to consider perturbations under any Wasserstein distance, and show that robust estimation is possible whenever a distribution's population statistics are robust under a certain family of friendly perturbations. This generalizes a property called resilience previously employed in the special case of mean estimation with outliers. We justify the generalized resilience property by showing that it holds under moment or hypercontractive conditions. Even in the total variation case, these subsume conditions in the literature for mean estimation, regression, and covariance estimation; the resulting analysis simplifies and sometimes improves these known results in both population limit and finite-sample rate. Our robust estimators are based on minimum distance (MD) functionals...
Figures
None.
Tweets
StatsPapers: Generalized Resilience and Robust Statistics. https://t.co/JlWRIBl1Op
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unique Words: 0

2.011 Mikeys
#5. Nonparametric estimation of conditional cure models for heavy-tailed distributions and under insufficient follow-up
Mikael Escobar-Bach, Ingrid Van Keilegom
When analyzing time-to-event data, it often happens that some subjects do not experience the event of interest. Survival models that take this feature into account (called 'cure models') have been developed in the presence of covariates. However, the current literature on nonparametric cure models with covariates cannot be applied when the follow-up is insufficient, i.e., when the right endpoint of the support of the censoring time is strictly smaller than that of the survival time of the susceptible subjects. In this paper we attempt to fill this gap in the literature by proposing new estimators of the conditional cure rate and the conditional survival function using extrapolation techniques coming from extreme value theory. We establish the asymptotic normality of the proposed estimators, and show how the estimators work for small samples by means of a simulation study. We also illustrate their practical applicability through the analysis of data on the survival of colon cancer patients.
Figures
None.
Tweets
mathSTb: Mikael Escobar-Bach, Ingrid Van Keilegom : Nonparametric estimation of conditional cure models for heavy-tailed distributions and under insufficient follow-up https://t.co/N7dVzvSWNN https://t.co/V7E9XaXrqJ
StatsPapers: Nonparametric estimation of conditional cure models for heavy-tailed distributions and under insufficient follow-up. https://t.co/Y2dFs1Vt4D
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 11068
Unique Words: 2242

2.011 Mikeys
#6. Inference on the change point with the jump size near the boundary of the region of detectability in high dimensional time series models
Abhishek Kaul, Venkata K Jandhyala, Stergios B Fotopoulos
We develop a projected least squares estimator for the change point parameter in a high dimensional time series model with a potential change point. Importantly we work under the setup where the jump size may be near the boundary of the region of detectability. The proposed methodology yields an optimal rate of convergence despite high dimensionality of the assumed model and a potentially diminishing jump size. The limiting distribution of this estimate is derived, thereby allowing construction of a confidence interval for the location of the change point. A secondary near optimal estimate is proposed which is required for the implementation of the optimal projected least squares estimate. The prestep estimation procedure is designed to also agnostically detect the case where no change point exists, thereby removing the need to pretest for the existence of a change point for the implementation of the inference methodology. Our results are presented under a general positive definite spatial dependence setup, assuming no special...
Figures
Tweets
mathSTb: Abhishek Kaul, Venkata K Jandhyala, Stergios B Fotopoulos : Inference on the change point with the jump size near the boundary of the region of detectability in high dimensional time series models https://t.co/MqygDlBmHu https://t.co/hiXprdQNMw
StatsPapers: Inference on the change point with the jump size near the boundary of the region of detectability in high dimensional time series models. https://t.co/Vv4cKAMSr6
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 18971
Unique Words: 2920

2.011 Mikeys
#7. Rotational Uniqueness Conditions Under Oblique Factor Correlation Metric
Carel F. W. Peeters
In an addendum to his seminal 1969 article, Jöreskog stated two sets of conditions for rotational identification of the oblique factor solution under utilization of fixed zero elements in the factor loadings matrix. These condition sets, formulated under factor correlation and factor covariance metrics, respectively, were claimed to be equivalent and to lead to global rotational uniqueness of the factor solution. It is shown here that the conditions for the oblique factor correlation structure need to be amended for global rotational uniqueness, and hence, that the condition sets are not equivalent in terms of unicity of the solution.
Figures
None.
Tweets
mathSTb: Carel F.W. Peeters : Rotational Uniqueness Conditions Under Oblique Factor Correlation Metric https://t.co/MajE4aTjXV https://t.co/aTno6mKRxw
StatsPapers: Rotational Uniqueness Conditions Under Oblique Factor Correlation Metric. https://t.co/NGK58sfzp6
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 1
Total Words: 2417
Unique Words: 954

2.006 Mikeys
#8. The Mathematics of Benford's Law -- A Primer
Arno Berger, Theodore P. Hill
This article provides a concise overview of the main mathematical theory of Benford's law in a form accessible to scientists and students who have had first courses in calculus and probability. In particular, one of the main objectives here is to aid researchers who are interested in applying Benford's law, and need to understand general principles clarifying when to expect the appearance of Benford's law in real-life data and when not to expect it. A second main target audience is students of statistics or mathematics, at all levels, who are curious about the mathematics underlying this surprising and robust phenomenon, and may wish to delve more deeply into the subject. This survey of the fundamental principles behind Benford's law includes many basic examples and theorems, but does not include the proofs or the most general statements of the theorems; rather it provides precise references where both may be found.
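As a small illustration of the phenomenon the primer surveys, the sketch below computes the Benford first-digit probabilities, P(D = d) = log10(1 + 1/d) for d = 1, ..., 9, and compares them with the leading digits of the powers of 2, a standard example of a sequence that follows the law. The function names and the choice of the first 1000 powers are assumptions made for this example, not taken from the article.

```python
import math


def benford_probability(d):
    """P(first significant digit = d) under Benford's law, for d in 1..9."""
    return math.log10(1.0 + 1.0 / d)


def leading_digit(n):
    """First decimal digit of a positive integer."""
    return int(str(n)[0])


def leading_digit_frequencies(values):
    """Empirical frequency of each leading digit 1..9 in `values`."""
    counts = {d: 0 for d in range(1, 10)}
    for v in values:
        counts[leading_digit(v)] += 1
    return {d: c / len(values) for d, c in counts.items()}


# Powers of 2 are a standard example of a sequence that follows the law.
powers_of_two = [2 ** k for k in range(1, 1001)]
observed = leading_digit_frequencies(powers_of_two)
for d in range(1, 10):
    print(d, f"{benford_probability(d):.4f}", f"{observed[d]:.4f}")
```

Each printed row shows a digit, its Benford probability, and its observed frequency among the powers of 2; the two columns should agree closely.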
Figures
None.
Tweets
JRBerrendero: The mathematics of Benford's law https://t.co/fgo3sTS9zT https://t.co/QZxhDe9Ypb
Picanumeros: RT @JRBerrendero: The mathematics of Benford's law https://t.co/fgo3sTS9zT https://t.co/QZxhDe9Ypb
ETSIIUNED: RT @JRBerrendero: The mathematics of Benford's law https://t.co/fgo3sTS9zT https://t.co/QZxhDe9Ypb
lriveragalicia: RT @JRBerrendero: The mathematics of Benford's law https://t.co/fgo3sTS9zT https://t.co/QZxhDe9Ypb
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

1.997 Mikeys
#9. Minimax Confidence Intervals for the Sliced Wasserstein Distance
Tudor Manole, Sivaraman Balakrishnan, Larry Wasserman
The Wasserstein distance has risen in popularity in the statistics and machine learning communities as a useful metric for comparing probability distributions. We study the problem of uncertainty quantification for the Sliced Wasserstein distance--an easily computable approximation of the Wasserstein distance. Specifically, we construct confidence intervals for the Sliced Wasserstein distance which have finite-sample validity under no assumptions or mild moment assumptions, and are adaptive in length to the smoothness of the underlying distributions. We also bound the minimax risk of estimating the Sliced Wasserstein distance, and show that the length of our proposed confidence intervals is minimax optimal over appropriate distribution classes. To motivate the choice of these classes, we also study minimax rates of estimating a distribution under the Sliced Wasserstein distance. These theoretical findings are complemented with a simulation study.
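To make "easily computable" concrete, the following is a rough sketch of the usual Monte Carlo approximation of the Sliced Wasserstein distance: project both samples onto random directions, compute the one-dimensional Wasserstein distance by sorting, and average. It does not implement the paper's confidence intervals. The function names, the equal-sample-size restriction, and the number of projections are assumptions chosen for this example.

```python
import math
import random


def wasserstein_1d(xs, ys, p=2):
    """W_p^p between two equal-size 1-D empirical distributions.

    For equal sample sizes the optimal coupling matches sorted values.
    """
    assert len(xs) == len(ys)
    return sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def random_unit_vector(dim, rng):
    """Uniform direction on the unit sphere via a normalized Gaussian."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sliced_wasserstein(X, Y, p=2, n_projections=200, seed=0):
    """Monte Carlo estimate of the Sliced Wasserstein distance SW_p(X, Y)."""
    rng = random.Random(seed)
    dim = len(X[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_unit_vector(dim, rng)
        # Project every point onto the direction theta.
        px = [sum(t * c for t, c in zip(theta, x)) for x in X]
        py = [sum(t * c for t, c in zip(theta, y)) for y in Y]
        total += wasserstein_1d(px, py, p)
    return (total / n_projections) ** (1.0 / p)


rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(300)]
Y = [[a + 1.0, b] for a, b in X]  # X shifted by (1, 0)
print(sliced_wasserstein(X, X), sliced_wasserstein(X, Y))
```

Each projection costs only a sort, which is why the sliced distance is much cheaper than solving a full optimal transport problem in several dimensions.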
Figures
None.
Tweets
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unique Words: 0

1.997 Mikeys
#10. Estimation of Wasserstein distances in the Spiked Transport Model
Jonathan Niles-Weed, Philippe Rigollet
We propose a new statistical model, the spiked transport model, which formalizes the assumption that two probability distributions differ only on a low-dimensional subspace. We study the minimax rate of estimation for the Wasserstein distance under this model and show that this low-dimensional structure can be exploited to avoid the curse of dimensionality. As a byproduct of our minimax analysis, we establish a lower bound showing that, in the absence of such structure, the plug-in estimator is nearly rate-optimal for estimating the Wasserstein distance in high dimension. We also give evidence for a statistical-computational gap and conjecture that any computationally efficient estimator is bound to suffer from the curse of dimensionality.
Figures
None.
Tweets
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter: @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 192,914 papers.
