Top 10 Arxiv Papers Today in Statistics


2.13 Mikeys
#1. Shapley Interpretation and Activation in Neural Networks
Yadong Li, Xin Cui
We propose a novel Shapley value approach to help address neural networks' interpretability and "vanishing gradient" problems. Our method is based on an accurate analytical approximation to the Shapley value of a neuron with ReLU activation. This analytical approximation admits a linear propagation of relevance across neural network layers, resulting in a simple, fast and sensible interpretation of neural networks' decision making process. We then derive a globally continuous and non-vanishing Shapley gradient, which can replace the conventional gradient in training neural network layers with ReLU activation, leading to better training performance. We further derive a Shapley Activation (SA) function, which is a close approximation to ReLU but features the Shapley gradient. The SA is easy to implement in existing machine learning frameworks. Numerical tests show that SA consistently outperforms ReLU in training convergence, accuracy and stability.
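For readers unfamiliar with the quantity this abstract refers to, the following is a minimal sketch of the exact Shapley value of a single ReLU neuron, computed by brute-force enumeration of input orderings. It is not the paper's analytical approximation, which the abstract does not spell out. The function names, the example weights, and the convention that an "absent" input contributes zero are all assumptions made for illustration.

```python
from itertools import permutations


def relu(z):
    return max(0.0, z)


def neuron(weights, bias, x, present):
    # Assumption: an input that is "absent" from the coalition contributes 0.
    z = bias + sum(w * xi for w, xi, p in zip(weights, x, present) if p)
    return relu(z)


def shapley_values(weights, bias, x):
    """Exact Shapley values of the inputs of one ReLU neuron.

    Averages each input's marginal contribution over all orderings,
    so the cost grows factorially with the number of inputs.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        present = [False] * n
        prev = neuron(weights, bias, x, present)
        for i in order:
            present[i] = True
            cur = neuron(weights, bias, x, present)
            phi[i] += cur - prev
            prev = cur
    return [p / len(orders) for p in phi]


# Hypothetical example values.
weights = [1.0, -2.0, 0.5]
bias = 0.1
x = [2.0, 0.5, 1.0]
phi = shapley_values(weights, bias, x)
print(phi)
```

The values sum to the neuron's output with all inputs present minus its output with none present, which is the efficiency property of Shapley values. The factorial cost of this enumeration is one reason a fast approximation is of interest.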
Figures
None.
Tweets
BrundageBot: Shapley Interpretation and Activation in Neural Networks. Yadong Li and Xin Cui https://t.co/tgpgdQnjBs
arxivml: "Shapley Interpretation and Activation in Neural Networks", Yadong Li, Xin Cui https://t.co/nHzKMilJad
arxiv_cs_LG: Shapley Interpretation and Activation in Neural Networks. Yadong Li and Xin Cui https://t.co/hTQRBdEJX6
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

2.13 Mikeys
#2. A Knowledge Transfer Framework for Differentially Private Sparse Learning
Lingxiao Wang, Quanquan Gu
We study the problem of estimating high dimensional models with underlying sparse structures while preserving the privacy of each training example. We develop a differentially private high-dimensional sparse learning framework using the idea of knowledge transfer. More specifically, we propose to distill the knowledge from a "teacher" estimator trained on a private dataset, by creating a new dataset from auxiliary features, and then train a differentially private "student" estimator using this new dataset. In addition, we establish the linear convergence rate as well as the utility guarantee for our proposed method. For sparse linear regression and sparse logistic regression, our method achieves improved utility guarantees compared with the best known results (Kifer et al., 2012; Wang and Gu, 2019). We further demonstrate the superiority of our framework through both synthetic and real-world data experiments.
Figures
None.
Tweets
BrundageBot: A Knowledge Transfer Framework for Differentially Private Sparse Learning. Lingxiao Wang and Quanquan Gu https://t.co/gdrzK4SuzH
arxivml: "A Knowledge Transfer Framework for Differentially Private Sparse Learning", Lingxiao Wang, Quanquan Gu https://t.co/ZtFdd2Q5J8
arxiv_cs_LG: A Knowledge Transfer Framework for Differentially Private Sparse Learning. Lingxiao Wang and Quanquan Gu https://t.co/NWDgGd165m
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

2.017 Mikeys
#3. Active learning for level set estimation under cost-dependent input uncertainty
Yu Inatsu, Masayuki Karasuyama, Keiichi Inoue, Ichiro Takeuchi
As part of a quality control process in manufacturing it is often necessary to test whether all parts of a product satisfy a required property, with as few inspections as possible. When multiple inspection apparatuses with different costs and precision exist, it is desirable that testing can be carried out cost-effectively by properly controlling the trade-off between the costs and the precision. In this paper, we formulate this as a level set estimation (LSE) problem under cost-dependent input uncertainty - LSE being a type of active learning for estimating the level set, i.e., the subset of the input space in which an unknown function value is greater or smaller than a pre-determined threshold. Then, we propose a new algorithm for LSE under cost-dependent input uncertainty with theoretical convergence guarantee. We demonstrate the effectiveness of the proposed algorithm by applying it to synthetic and real datasets.
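The level set definition in this abstract can be illustrated with a small sketch. It assumes each candidate input already has a lower and an upper confidence bound on the unknown function value, and it labels the input as above the threshold, below it, or still undecided. This is a generic classification rule used for illustration only, not the algorithm proposed in the paper; the function name and the example numbers are hypothetical.

```python
def classify_level_set(lower, upper, threshold):
    """Label each point from confidence bounds on the unknown function value."""
    labels = []
    for lo, hi in zip(lower, upper):
        if lo > threshold:
            labels.append("above")      # whole interval lies above the threshold
        elif hi < threshold:
            labels.append("below")      # whole interval lies below the threshold
        else:
            labels.append("undecided")  # interval straddles the threshold
    return labels


# Hypothetical confidence bounds for three candidate inputs.
lower = [0.6, 0.1, 0.4]
upper = [0.9, 0.3, 0.7]
labels = classify_level_set(lower, upper, threshold=0.5)
print(labels)
```

In an active learning loop, further inspections would be spent on the undecided points, which is where a trade-off between inspection cost and precision would come into play.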
Figures
None.
Tweets
arxiv_cs_LG: Active learning for level set estimation under cost-dependent input uncertainty. Yu Inatsu, Masayuki Karasuyama, Keiichi Inoue, and Ichiro Takeuchi https://t.co/I0kCWBMSoO
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 0
Unique Words: 0

2.017 Mikeys
#4. Shallow Self-Learning for Reject Inference in Credit Scoring
Nikita Kozodoi, Panagiotis Katsas, Stefan Lessmann, Luis Moreira-Matias, Konstantinos Papakonstantinou
Credit scoring models support loan approval decisions in the financial services industry. Lenders train these models on data from previously granted credit applications, where the borrowers' repayment behavior has been observed. This approach creates sample bias. The scoring model (i.e., classifier) is trained on accepted cases only. Applying the resulting model to screen credit applications from the population of all borrowers degrades model performance. Reject inference comprises techniques to overcome sampling bias through assigning labels to rejected cases. The paper makes two contributions. First, we propose a self-learning framework for reject inference. The framework is geared toward real-world credit scoring requirements through considering distinct training regimes for iterative labeling and model training. Second, we introduce a new measure to assess the effectiveness of reject inference strategies. Our measure leverages domain knowledge to avoid artificial labeling of rejected cases during strategy evaluation. We...
Figures
None.
Tweets
arxiv_cs_LG: Shallow Self-Learning for Reject Inference in Credit Scoring. Nikita Kozodoi, Panagiotis Katsas, Stefan Lessmann, Luis Moreira-Matias, and Konstantinos Papakonstantinou https://t.co/4cYjvadCpt
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 0
Unique Words: 0

2.017 Mikeys
#5. d-blink: Distributed End-to-End Bayesian Entity Resolution
Neil G. Marchant, Rebecca C. Steorts, Andee Kaplan, Benjamin I. P. Rubinstein, Daniel N. Elazar
Entity resolution (ER) (record linkage or de-duplication) is the process of merging together noisy databases, often in the absence of a unique identifier. A major advancement in ER methodology has been the application of Bayesian generative models. Such models provide a natural framework for clustering records to unobserved (latent) entities, while providing exact uncertainty quantification and tight performance bounds. Despite these advancements, existing models do not scale to realistically-sized databases (larger than 1000 records) and they do not incorporate probabilistic blocking. In this paper, we propose "distributed Bayesian linkage" or d-blink -- the first scalable and distributed end-to-end Bayesian model for ER, which propagates uncertainty in blocking, matching and merging. We make several novel contributions, including: (i) incorporating probabilistic blocking directly into the model through auxiliary partitions; (ii) support for missing values; (iii) a partially-collapsed Gibbs sampler; and (iv) a novel perturbation...
Figures
None.
Tweets
arxiv_cs_LG: d-blink: Distributed End-to-End Bayesian Entity Resolution. Neil G. Marchant, Rebecca C. Steorts, Andee Kaplan, Benjamin I. P. Rubinstein, and Daniel N. Elazar https://t.co/JKFMGcFG7z
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 0
Unique Words: 0

1.968 Mikeys
#6. Estimating Fisher Information Matrix in Latent Variable Models based on the Score Function
Maud Delattre, Estelle Kuhn
The Fisher information matrix (FIM) is a key quantity in statistics as it is required, for example, for evaluating asymptotic precisions of parameter estimates, for computing test statistics or asymptotic distributions in statistical testing, and for evaluating post model selection inference results or optimality criteria in experimental designs. However, its exact computation is often not trivial. In particular, in many latent variable models it is intricate due to the presence of unobserved variables. Therefore the observed FIM is usually considered in this context to estimate the FIM. Several methods have been proposed to approximate the observed FIM when it cannot be evaluated analytically. Among the most frequently used approaches are Monte-Carlo methods or iterative algorithms derived from the missing information principle. All these methods require computing second derivatives of the complete data log-likelihood, which leads to some disadvantages from a computational point of view. In this paper, we present a new approach to...
Figures
None.
Tweets
None.
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

1.968 Mikeys
#7. Monte Carlo Approximation of Bayes Factors via Mixing with Surrogate Distributions
Chenguang Dai, Jun S. Liu
By mixing the posterior distribution with a surrogate distribution, of which the normalizing constant is tractable, we describe a new method to estimate the normalizing constant using the Wang-Landau algorithm. We then introduce an accelerated version of the proposed method using the momentum technique. In addition, several extensions are discussed, including (1) a parallel variant, which inserts a sequence of intermediate distributions between the posterior distribution and the surrogate distribution, to further improve the efficiency of the proposed method; (2) the use of the surrogate distribution to help detect potential multimodality of the posterior distribution, upon which a better sampler can be designed utilizing mode jumping algorithms; (3) a new jumping mechanism for general reversible jump Markov chain Monte Carlo algorithms that combines the Multiple-try Metropolis and the directional sampling algorithm, which can be used to estimate the normalizing constant when a surrogate distribution is difficult to come by. We...
Figures
None.
Tweets
None.
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

1.968 Mikeys
#8. Generalized Records for Functional Time Series with Application to Unit Root Tests
Israel Martínez-Hernández, Marc G. Genton
A generalization of the definition of records to functional data is proposed. The definition is based on ranking curves using a notion of functional depth. This approach allows us to study the curves of the number of records over time. We focus on functional time series and apply ideas from univariate time series to derive the asymptotic distribution describing the number of records. A unit root test is proposed as an application of functional record theory. Through a Monte Carlo study, different scenarios of functional processes are simulated to evaluate the performance of the unit root test. The generalized record definition is applied to two different datasets: annual mortality rates in France and daily curves of wind speed at Yanbu, Saudi Arabia. The record curves are identified and the underlying functional process is studied based on the number of record curves observed.
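As background for the univariate notion that this abstract generalizes, here is a minimal sketch of counting upper records in a scalar time series: an observation is a record if it exceeds every earlier observation. The functional version in the paper ranks curves by a functional depth instead, which is not reproduced here. The function names and the sample series are assumptions for illustration.

```python
def record_flags(series):
    """Return True at each position where a new upper record occurs."""
    flags = []
    current_max = None
    for value in series:
        is_record = current_max is None or value > current_max
        flags.append(is_record)
        if is_record:
            current_max = value
    return flags


def expected_records_iid(n):
    # For n i.i.d. continuous observations, the expected number of
    # upper records is the harmonic number 1 + 1/2 + ... + 1/n.
    return sum(1.0 / k for k in range(1, n + 1))


# Hypothetical sample series.
series = [1.0, 3.0, 2.0, 5.0, 4.0]
flags = record_flags(series)
n_records = sum(flags)
print(flags, n_records, expected_records_iid(len(series)))
```

Under stationarity, the number of records grows only logarithmically with the series length, whereas a series with a unit root tends to set records more often. A comparison of this kind is the intuition behind using record counts in a unit root test.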
Figures
None.
Tweets
None.
Github
None.
Youtube
None.
Other stats
Sample Sizes : [100, 1000, 10000]
Authors: 2
Total Words: 13108
Unique Words: 2709

1.968 Mikeys
#9. MACE: Multiscale Abrupt Change Estimation Under Complex Temporal Dynamics
Weichi Wu, Zhou Zhou
We consider the problem of detecting abrupt changes in an otherwise smoothly evolving trend whilst the covariance and higher-order structures of the system can experience both smooth and abrupt changes over time. The number of abrupt change points is allowed to diverge to infinity with the jump sizes possibly shrinking to zero. The method is based on a multiscale application of an optimal jump-pass filter to the time series, where the scales are dense between admissible lower and upper bounds. The MACE method is shown to be able to detect all abrupt change points within a nearly optimal range with a prescribed probability asymptotically. For a time series of length $n$, the computational complexity of MACE is $O(n)$ for each scale and $O(n\log^{1+\epsilon} n)$ overall, where $\epsilon$ is an arbitrarily small positive constant. Simulations and data analysis show that, under complex temporal dynamics, MACE performs favourably compared with some of the state-of-the-art multiscale change point detection methods.
Figures
None.
Tweets
None.
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

1.968 Mikeys
#10. A Double Penalty Model for Interpretability
Wenjia Wang, Yi-Hui Zhou
Modern statistical learning techniques have often emphasized prediction performance over interpretability, giving rise to "black box" models that may be difficult to understand, and to generalize to other settings. We conceptually divide a prediction model into interpretable and non-interpretable portions, as a means to produce models that are highly interpretable with little loss in performance. Implementation of the model is achieved by considering separability of the interpretable and non-interpretable portions, along with a doubly penalized procedure for model fitting. We specify conditions under which convergence of model estimation can be achieved via cyclic coordinate ascent, and the consistency of model estimation holds. We apply the methods to datasets for microbiome host trait prediction and a diabetes trait, and discuss practical tradeoff diagnostics to select models with high interpretability.
Figures
None.
Tweets
None.
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter activity).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 189,566 papers.
