Top 10 arXiv Papers Today in Statistics


0.0 Mikeys
#1. Bayesian Inference for Big Spatial Data Using Non-stationary Spectral Simulation
Hou-Cheng Yang, Jonathan R. Bradley
It is increasingly understood that the assumption of stationarity is unrealistic for many spatial processes. In this article, we combine dimension expansion with a spectral method to model big non-stationary spatial fields in a computationally efficient manner. Specifically, we use Mejia and Rodriguez-Iturbe (1974)'s spectral simulation approach to simulate a spatial process with a covariogram at locations that have an expanded dimension. We introduce Bayesian hierarchical modelling to dimension expansion, which originally has only been modeled using a method of moments approach. In particular, we simulate from the posterior distribution using a collapsed Gibbs sampler. Our method is both full rank and non-stationary, and can be applied to big spatial data because it does not involve storing and inverting large covariance matrices. Additionally, we have fewer parameters than many other non-stationary spatial models. We demonstrate the wide applicability of our approach using a simulation study, and an application using ozone data...
Tweets
StatsPapers: Bayesian Inference for Big Spatial Data Using Non-stationary Spectral Simulation. https://t.co/hoTHURUn8a
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 8941
Unique Words: 2224

0.0 Mikeys
#2. Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives
Antoine Dedieu, Hussein Hazimeh, Rahul Mazumder
We consider a discrete optimization based approach for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features. Recent work has shown that mixed integer programming (MIP) can be used to solve (to optimality) $\ell_0$-regularized problems at scales much larger than what was conventionally considered possible in the statistics and machine learning communities. Despite their usefulness, MIP-based approaches are significantly slower compared to relatively mature algorithms based on $\ell_1$-regularization and relatives. We aim to bridge this computational gap by developing new MIP-based algorithms for $\ell_0$-regularized classification. We propose two classes of scalable algorithms: an exact algorithm that can handle $p\approx 50,000$ features in a few minutes, and approximate algorithms that can address instances with $p\approx 10^6$ in times comparable to fast $\ell_1$-based algorithms. Our exact algorithm is based on the novel idea of integrality generation, which...
Tweets
arxivml: "Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives", Antoine Dedieu, Hussein Hazi… https://t.co/oTzJ7juP18
arxiv_cs_LG: Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives. Antoine Dedieu, Hussein Hazimeh, and Rahul Mazumder https://t.co/1tjrVwKTsP
Memoirs: Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives. https://t.co/U8HRAowRa4
Github

Efficient Algorithms for L0 Regularized Learning

Repository: L0Learn
User: hazimehh
Language: C++
Stargazers: 48
Subscribers: 8
Forks: 12
Open Issues: 2
Youtube
None.
Other stats
Sample Sizes: [805, 140, 420]
Authors: 3
Total Words: 16705
Unique Words: 3570

0.0 Mikeys
#3. Neglecting Uncertainties Leads to Suboptimal Decisions About Home-Owners Flood Risk Management
Mahkameh Zarekarizi, Vivek Srikrishnan, Klaus Keller
Homeowners around the world elevate houses to manage flood risks. Deciding how high to elevate the house poses a nontrivial decision problem. The U.S. Federal Emergency Management Agency (FEMA) recommends elevating a house to the Base Flood Elevation (the elevation of the 100-yr flood) plus a freeboard. This recommendation neglects many uncertainties. Here we use a multi-objective robust decision-making framework to analyze this decision in the face of deep uncertainties. We find strong interactions between the economic, engineering, and Earth science uncertainties, illustrating the need for an integrated analysis. We show that considering deep uncertainties surrounding flood hazards, the discount rate, the house lifetime, and the fragility increases the economically optimal house elevation to values well above the recommendation by FEMA. An improved decision-support for home-owners has the potential to drastically improve decisions and outcomes.
Figures
None.
Tweets
StatsPapers: Neglecting Uncertainties Leads to Suboptimal Decisions About Home-Owners Flood Risk Management. https://t.co/fhyUhO98h7
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 0
Unique Words: 0

0.0 Mikeys
#4. Coarsened mixtures of hierarchical skew normal kernels for flow cytometry analyses
Shai Gorsky, Cliburn Chan, Li Ma
Flow cytometry (FCM) is the standard multi-parameter assay used to measure single cell phenotype and functionality. It is commonly used to quantify the relative frequencies of cell subsets in blood and disaggregated tissues. A typical analysis of FCM data involves cell classification - the identification of cell subgroups in the sample - and comparisons of the cell subgroups across samples. While modern experiments often necessitate the collection and processing of samples in multiple batches, analysis of FCM data across batches is challenging because the locations in the marker space of cell subsets may vary across samples. Differences across samples may occur because of true biological variation or technical reasons such as antibody lot effects or instrument optics. An important step in comparative analyses of multi-sample FCM data is cross-sample calibration, whose goal is to align cell subsets across multiple samples in the presence of variations in locations, so that variation due to technical reasons is minimized and true...
Figures
None.
Tweets
StatsPapers: Coarsened mixtures of hierarchical skew normal kernels for flow cytometry analyses. https://t.co/BtaL0CGgFV
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 0
Unique Words: 0

0.0 Mikeys
#5. Bayesian inference of dynamics from partial and noisy observations using data assimilation and machine learning
Marc Bocquet, Julien Brajard, Alberto Carrassi, Laurent Bertino
The reconstruction from observations of high-dimensional chaotic dynamics such as geophysical flows is hampered by (i) the partial and noisy observations that can realistically be obtained, (ii) the need to learn from long time series of data, and (iii) the unstable nature of the dynamics. To achieve such inference from the observations over long time series, it has been suggested to combine data assimilation and machine learning in several ways. We show how to unify these approaches from a Bayesian perspective using expectation-maximization and coordinate descents. Implementations and approximations of these methods are also discussed. Finally, we numerically and successfully test the approach on two relevant low-order chaotic models with distinct identifiability.
Figures
None.
Tweets
BrundageBot: Bayesian inference of dynamics from partial and noisy observations using data assimilation and machine learning. Marc Bocquet, Julien Brajard, Alberto Carrassi, and Laurent Bertino https://t.co/hxt3fIcxoO
arxiv_cs_LG: Bayesian inference of dynamics from partial and noisy observations using data assimilation and machine learning. Marc Bocquet, Julien Brajard, Alberto Carrassi, and Laurent Bertino https://t.co/5lgitwBrp3
Memoirs: Bayesian inference of dynamics from partial and noisy observations using data assimilation and machine learning. https://t.co/9TnQlsDbZI
arxivml: "Bayesian inference of dynamics from partial and noisy observations using data assimilation and machine learning", … https://t.co/5sVdHpJj8r
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 4
Total Words: 0
Unique Words: 0

0.0 Mikeys
#6. Multifidelity Approximate Bayesian Computation with Sequential Monte Carlo Parameter Sampling
Thomas P. Prescott, Ruth E. Baker
Multifidelity approximate Bayesian computation (MF-ABC) is a likelihood-free technique for parameter inference that exploits model approximations to significantly increase the speed of ABC algorithms (Prescott and Baker, 2020). Previous work has considered MF-ABC only in the context of rejection sampling, which does not explore parameter space particularly efficiently. In this work, we integrate the multifidelity approach with the ABC sequential Monte Carlo (ABC-SMC) algorithm into a new MF-ABC-SMC algorithm. We show that the improvements generated by each of ABC-SMC and MF-ABC to the efficiency of generating Monte Carlo samples and estimates from the ABC posterior are amplified when the two techniques are used together.
Tweets
ABC_Research: Prescott and Baker develop Multifidelity Approximate Bayesian Computation with Sequential Monte Carlo Parameter Sampling https://t.co/ZZFVpi92Sx @ruth_baker
StatsPapers: Multifidelity Approximate Bayesian Computation with Sequential Monte Carlo Parameter Sampling. https://t.co/tGqEV5SI4n
Github

Multifidelity Approximate Bayesian Computation with Sequential Monte Carlo Parameter Sampling

Repository: mf-abc-smc
User: tpprescott
Language: Julia
Stargazers: 0
Subscribers: 1
Forks: 0
Open Issues: 0
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 16769
Unique Words: 2921

0.0 Mikeys
#7. Markov Chain Monte Carlo Methods, a survey with some frequent misunderstandings
Christian P. Robert, Wu Changye
In this chapter, we review some of the most standard MCMC tools used in Bayesian computation, along with vignettes on standard misunderstandings of these approaches taken from Q&A's on the forum Cross-validated answered by the first author.
Figures
None.
Tweets
JRBerrendero: Markov Chain Monte Carlo Methods, a survey with some frequent misunderstandings. https://t.co/YzFUuuhpCL https://t.co/r3R0ugRgzt
tmasada: [2001.06249] Markov Chain Monte Carlo Methods, a survey with some frequent misunderstandings https://t.co/X2ZCfkrcIw
StatsPapers: Markov Chain Monte Carlo Methods, a survey with some frequent misunderstandings. https://t.co/wKVaIUcHwj
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 0
Unique Words: 0

0.0 Mikeys
#8. Causal models for dynamical systems
Jonas Peters, Stefan Bauer, Niklas Pfister
A probabilistic model describes a system in its observational state. In many situations, however, we are interested in the system's response under interventions. The class of structural causal models provides a language that allows us to model the behaviour under interventions. It can be taken as a starting point to answer a plethora of causal questions, including the identification of causal effects or causal structure learning. In this chapter, we provide a natural and straightforward extension of this concept to dynamical systems, focusing on continuous time models. In particular, we introduce two types of causal kinetic models that differ in how the randomness enters into the model: it may either be considered as observational noise or as systematic driving noise. In both cases, we define interventions and therefore provide a possible starting point for causal inference. In this sense, the book chapter provides more questions than answers. The focus of the proposed causal kinetic models lies on the dynamics themselves...
Figures
None.
Tweets
StatsPapers: Causal models for dynamical systems. https://t.co/Wz2hgxnSFo
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 0
Unique Words: 0

0.0 Mikeys
#9. Communication-Efficient Distributed Estimator for Generalized Linear Models with a Diverging Number of Covariates
Ping Zhou, Zhen Yu, Jingyi Ma, Maozai Tian
Distributed statistical inference has recently attracted immense attention. Herein, we study the asymptotic efficiency of the maximum likelihood estimator (MLE), the one-step MLE, and the aggregated estimating equation estimator for generalized linear models with a diverging number of covariates. Then a novel method is proposed to obtain an asymptotically efficient estimator for large-scale distributed data by two rounds of communication between local machines and the central server. The assumption on the number of machines in this paper is more relaxed and thus practical for real-world applications. Simulations and a case study demonstrate the satisfactory finite-sample performance of the proposed estimators.
Figures
None.
Tweets
arxiv_cs_LG: Communication-Efficient Distributed Estimator for Generalized Linear Models with a Diverging Number of Covariates. Ping Zhou, Zhen Yu, Jingyi Ma, and Maozai Tian https://t.co/kfD1qsdesf
Memoirs: Communication-Efficient Distributed Estimator for Generalized Linear Models with a Diverging Number of Covariates. https://t.co/h5lbxmbUVa
arxivml: "Communication-Efficient Distributed Estimator for Generalized Linear Models with a Diverging Number of Covariates"… https://t.co/f6AIORHsIi
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 4
Total Words: 0
Unique Words: 0

0.0 Mikeys
#10. Optimal Crossover Designs for Generalized Linear Models
Jeevan Jankar, Abhyuday Mandal, Jie Yang
We identify locally $D$-optimal crossover designs for generalized linear models. We use generalized estimating equations to estimate the model parameters along with their variances. To capture the dependency among the observations coming from the same subject, we propose six different correlation structures. We identify the optimal allocations of units for different sequences of treatments. For two-treatment crossover designs, we show via simulations that the optimal allocations are reasonably robust to different choices of the correlation structures. We discuss a real example of multiple treatment crossover experiments using Latin square designs. Using a simulation study, we show that a two-stage design with our locally $D$-optimal design at the second stage is more efficient than the uniform design, especially when the responses from the same subject are correlated.
Figures
None.
Tweets
StatsPapers: Optimal Crossover Designs for Generalized Linear Models. https://t.co/UCVTsCv3FO
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 3
Total Words: 0
Unique Words: 0

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 255,445 papers.
