Top 10 arXiv Papers Today in Statistics Theory


2.045 Mikeys
#1. Minimax Rates in Network Analysis: Graphon Estimation, Community Detection and Hypothesis Testing
Chao Gao, Zongming Ma
This paper surveys some recent developments in fundamental limits and optimal algorithms for network analysis. We focus on minimax optimal rates in three fundamental problems of network analysis: graphon estimation, community detection, and hypothesis testing. For each problem, we review state-of-the-art results in the literature followed by general principles behind the optimal procedures that lead to minimax estimation and testing. This allows us to connect problems in network analysis to other statistical inference problems from a general perspective.
Tweets
StatsPapers: Minimax Rates in Network Analysis: Graphon Estimation, Community Detection and Hypothesis Testing. https://t.co/vBSSUYn6kJ
SRoyLee: Minimax Rates in Network Analysis: Graphon Estimation Community Detection and Hypothesis Testing - https://t.co/YvL3c3BDKQ
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 13951
Unique Words: 3308

2.035 Mikeys
#2. The autoregression bootstrap for kernel estimates of smooth nonlinear functional time series
Johannes T. N. Krebs, Jürgen E. Franke
Functional time series have become an integral part of both functional data and time series analysis. This paper deals with the functional autoregressive model of order 1 and the autoregression bootstrap for smooth functions. The regression operator is estimated in the framework developed by Ferraty and Vieu [2004] and Ferraty et al. [2007], which is here extended to the double functional case under an assumption of stationary ergodic data which dates back to Laib and Louani [2010]. The main result of this article is the characterization of the asymptotic consistency of the bootstrapped regression operator.
Tweets
StatsPapers: The autoregression bootstrap for kernel estimates of smooth nonlinear functional time series. https://t.co/bcugA2b51W
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 18568
Unique Words: 2926

2.013 Mikeys
#3. State-dependent jump activity estimation for Markovian semimartingales
Fabian Mies
The jump behavior of an infinitely active Itô semimartingale can be conveniently characterized by a jump activity index of Blumenthal-Getoor type, typically assumed to be constant in time. We study Markovian semimartingales with a non-constant, state-dependent jump activity index and a non-vanishing continuous diffusion component. Nonparametric estimators for the functional jump activity index as well as for the drift function are proposed and shown to be asymptotically normal under combined high-frequency and long-time-span asymptotics. The results are based on a novel uniform bound on the Markov generator of the jump diffusion.
Tweets
MathPaper: State-dependent jump activity estimation for Markovian semimartingales. https://t.co/ofTGAY6eA0
Other stats
Sample Sizes: None.
Authors: 1
Total Words: 12271
Unique Words: 2469

2.013 Mikeys
#4. On a minimum distance procedure for threshold selection in tail analysis
Holger Drees, Anja Janßen, Sidney I. Resnick, Tiandong Wang
Power-law distributions have been widely observed in different areas of scientific research. Practical estimation issues include how to select a threshold above which observations follow a power-law distribution and then how to estimate the power-law tail index. A minimum distance selection procedure (MDSP) is proposed in Clauset et al. (2009) and has been widely adopted in practice, especially in the analyses of social networks. However, theoretical justifications for this selection procedure remain scant. In this paper, we study the asymptotic behavior of the selected threshold and the corresponding power-law index given by the MDSP. We find that the MDSP tends to choose too high a threshold level and leads to Hill estimates with large variances and root mean squared errors for simulated data with Pareto-like tails.
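The minimum distance selection procedure mentioned in this abstract can be sketched in a few lines. The Python below is a minimal illustration, not the authors' code. It assumes the common formulation in which, for each candidate number k of upper order statistics, the tail index is estimated with the Hill estimator, and k is then chosen to minimize the Kolmogorov-Smirnov distance between the empirical distribution of the exceedance ratios and the fitted Pareto law. The function names (`hill_estimate`, `ks_distance`, `mdsp`) and the `k_min` parameter are hypothetical.

```python
import math
import random


def hill_estimate(desc, k):
    """Hill estimate of the tail index from the k largest values.

    `desc` must be sorted in descending order; desc[k] acts as the threshold.
    Returns None when the estimate is undefined (all k values tie with it).
    """
    threshold = desc[k]
    mean_log = sum(math.log(desc[i] / threshold) for i in range(k)) / k
    return 1.0 / mean_log if mean_log > 0 else None


def ks_distance(desc, k, alpha):
    """Kolmogorov-Smirnov distance between the empirical distribution of the
    k exceedance ratios desc[i] / desc[k] and a Pareto(alpha) law on [1, inf)."""
    threshold = desc[k]
    ratios = sorted(desc[i] / threshold for i in range(k))
    distance = 0.0
    for j, y in enumerate(ratios):
        model_cdf = 1.0 - y ** (-alpha)
        # The empirical CDF jumps from j/k to (j+1)/k at y; check both sides.
        distance = max(distance,
                       abs((j + 1) / k - model_cdf),
                       abs(j / k - model_cdf))
    return distance


def mdsp(data, k_min=2):
    """Return (k, threshold, alpha) minimizing the KS distance over k."""
    desc = sorted(data, reverse=True)
    best = None
    for k in range(k_min, len(desc)):
        if desc[k] <= 0:
            break  # the Hill estimator needs a positive threshold
        alpha = hill_estimate(desc, k)
        if alpha is None:
            continue
        d = ks_distance(desc, k, alpha)
        if best is None or d < best[0]:
            best = (d, k, desc[k], alpha)
    if best is None:
        raise ValueError("no valid threshold found")
    return best[1:]


if __name__ == "__main__":
    random.seed(0)
    # Assumed test data: an exact Pareto sample with tail index 2.
    sample = [random.paretovariate(2.0) for _ in range(500)]
    k, threshold, alpha = mdsp(sample)
    print(f"k = {k}, threshold = {threshold:.3f}, Hill estimate = {alpha:.3f}")
```

On an exact Pareto sample every threshold is valid, so the selected k reflects sampling noise only. Repeating the run over many seeds is one way to look at the variability of the selected threshold that the abstract describes.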
Tweets
MathPaper: On a minimum distance procedure for threshold selection in tail analysis. https://t.co/y6wRUtSYzS
Other stats
Sample Sizes: None.
Authors: 4
Total Words: 10813
Unique Words: 2573

2.012 Mikeys
#5. A Schur transform for spatial stochastic processes
James Mathews
The variance, higher order moments, covariance, and joint moments or cumulants are shown to be special cases of a certain tensor in $V^{\otimes n}$ defined in terms of a collection $X_1,...,X_n$ of $V$-valued random variables, for an appropriate finite-dimensional real vector space $V$. A statistical transform is proposed from such collections--finite spatial stochastic processes--to numerical tuples using the Schur-Weyl decomposition of $V^{\otimes n}$. It is analogous to the Fourier transform, replacing the periodicity group $\mathbb{Z}$, $\mathbb{R}$, or $U(1)$ with the permutation group $S_{n}$. As a test case, we apply the transform to one of the datasets used for benchmarking the Continuous Registration Challenge, the thoracic 4D Computed Tomography (CT) scans from the M.D. Anderson Cancer Center available for download from DIR-Lab. Further applications to morphometry and statistical shape analysis are suggested.
Tweets
StatsPapers: A Schur transform for spatial stochastic processes. https://t.co/xYI0bcdaSl
Other stats
Sample Sizes: None.
Authors: 1
Total Words: 4514
Unique Words: 1359

2.004 Mikeys
#6. Towards Characterising Bayesian Network Models under Selection
Angelos P. Armen, Robin J. Evans
Real-life statistical samples are often plagued by selection bias, which complicates drawing conclusions about the general population. When learning causal relationships between the variables is of interest, the sample may be assumed to be from a distribution in a causal Bayesian network (BN) model under selection. Understanding the constraints in the model under selection is the first step towards recovering causal structure in the original model. The conditional-independence (CI) constraints in a BN model under selection have already been characterised; there exist, however, additional, non-CI constraints in such models. In this work, some initial results are provided that simplify the characterisation problem. In addition, an algorithm is designed for identifying compelled ancestors (definite causes) from a completed partially directed acyclic graph (CPDAG). Finally, a non-CI, non-factorisation constraint in a BN model under selection is computed for the first time.
Tweets
StatsPapers: Towards Characterising Bayesian Network Models under Selection. https://t.co/BZ7xIuhYS9
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 10254
Unique Words: 1844

2.002 Mikeys
#7. Multiscale change point detection for dependent data
Holger Dette, Theresa Schüler, Mathias Vetter
In this paper we study the theoretical properties of the simultaneous multiscale change point estimator (SMUCE) proposed by Frick et al. (2014) in regression models with dependent error processes. Empirical studies show that in this case the change point estimate is inconsistent, but it is not known if alternatives suggested in the literature for correlated data are consistent. We propose a modification of SMUCE scaling the basic statistic by the long run variance of the error process, which is estimated by a difference-type variance estimator calculated from local means from different blocks. For this modification we prove model consistency for physical dependent error processes and illustrate the finite sample performance by means of a simulation study.
Tweets
MathPaper: Multiscale change point detection for dependent data. https://t.co/8UeZI5BPaf
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 7974
Unique Words: 1939

0.0 Mikeys
#8. Composite likelihood estimation for a Gaussian process under fixed domain asymptotics
François Bachoc, Moreno Bevilacqua, Daira Velandia
We study composite likelihood estimation of the covariance parameters with data from a one-dimensional Gaussian process with exponential covariance function under fixed domain asymptotics. We show that the weighted pairwise maximum likelihood estimator of the microergodic parameter can be consistent or inconsistent, depending on the range of admissible parameter values in the likelihood optimization. On the contrary, the weighted pairwise conditional maximum likelihood estimator is always consistent. Both estimators are also asymptotically Gaussian when they are consistent, with asymptotic variance larger or strictly larger than that of the maximum likelihood estimator. A simulation study is presented in order to compare the finite sample behavior of the pairwise likelihood estimators with their asymptotic distributions.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 9179
Unique Words: 2157

0.0 Mikeys
#9. Partial recovery bounds for clustering with the relaxed $K$means
Christophe Giraud, Nicolas Verzelen
We investigate the clustering performance of the relaxed $K$means in the setting of sub-Gaussian Mixture Model (sGMM) and Stochastic Block Model (SBM). After identifying the appropriate signal-to-noise ratio (SNR), we prove that the misclassification error decays exponentially fast with respect to this SNR. These partial recovery bounds for the relaxed $K$means improve upon results currently known in the sGMM setting. In the SBM setting, applying the relaxed $K$means SDP makes it possible to handle general connection probabilities, whereas other SDPs investigated in the literature are restricted to the assortative case (where within-group probabilities are larger than between-group probabilities). Again, this partial recovery bound complements the state-of-the-art results. Altogether, these results put forward the versatility of the relaxed $K$means.
Tweets
MarcosMatabuena: RT @StatsPapers: Partial recovery bounds for clustering with the relaxed $K$means. https://t.co/RpHnoW2I3w
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 18017
Unique Words: 3274

0.0 Mikeys
#10. Bootstrapping Max Statistics in High Dimensions: Near-Parametric Rates Under Weak Variance Decay and Application to Functional Data Analysis
Miles E. Lopes, Zhenhua Lin, Hans-Georg Mueller
In recent years, bootstrap methods have drawn attention for their ability to approximate the laws of "max statistics" in high-dimensional problems. A leading example of such a statistic is the coordinate-wise maximum of a sample average of $n$ random vectors in $\mathbb{R}^p$. Existing results for this statistic show that the bootstrap can work when $n\ll p$, and rates of approximation (in Kolmogorov distance) have been obtained with only logarithmic dependence in $p$. Nevertheless, one of the challenging aspects of this setting is that established rates tend to scale like $n^{-1/6}$ as a function of $n$. The main purpose of this paper is to demonstrate that improvement in rate is possible when extra model structure is available. Specifically, we show that if the coordinate-wise variances of the observations exhibit decay, then a nearly $n^{-1/2}$ rate can be achieved, independent of $p$. Furthermore, a surprising aspect of this dimension-free rate is that it holds even when the decay is very weak. As a numerical illustration,...
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 15790
Unique Words: 3312

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real time) based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 57,756 papers.
