Top 10 Arxiv Papers Today in Statistics


2.032 Mikeys
#1. Individualized Multilayer Tensor Learning with An Application in Imaging Analysis
Xiwei Tang, Xuan Bi, Annie Qu
This work is motivated by multimodality breast cancer imaging data, which is quite challenging in that the signals of discrete tumor-associated microvesicles (TMVs) are randomly distributed with heterogeneous patterns. This imposes a significant challenge for conventional imaging regression and dimension reduction models assuming a homogeneous feature structure. We develop an innovative multilayer tensor learning method to incorporate heterogeneity into a higher-order tensor decomposition and predict disease status effectively by utilizing subject-wise imaging features and multimodality information. Specifically, we construct a multilayer decomposition which leverages an individualized imaging layer in addition to a modality-specific tensor structure. One major advantage of our approach is that we are able to efficiently capture the heterogeneous spatial features of signals that are not characterized by a population structure while simultaneously integrating multimodality information. To achieve scalable computing, we...
Tweets
arxivml: "Individualized Multilayer Tensor Learning with An Application in Imaging Analysis", Xiwei Tang, Xuan Bi, Annie Qu https://t.co/cezGnBW5Wz
arxiv_cs_LG: Individualized Multilayer Tensor Learning with An Application in Imaging Analysis. Xiwei Tang, Xuan Bi, and Annie Qu https://t.co/hXblZU0pvt
Memoirs: Individualized Multilayer Tensor Learning with An Application in Imaging Analysis. https://t.co/TX9LujwsfV
arxiv_cscv: Individualized Multilayer Tensor Learning with An Application in Imaging Analysis https://t.co/Loi5OtkXS9
arxiv_cscv: Individualized Multilayer Tensor Learning with An Application in Imaging Analysis https://t.co/Loi5Ot3mtz
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 12528
Unique Words: 3051

2.021 Mikeys
#2. Stochastic Optimization of Sorting Networks via Continuous Relaxations
Aditya Grover, Eric Wang, Aaron Zweig, Stefano Ermon
Sorting input objects is an important step in many machine learning pipelines. However, the sorting operator is non-differentiable with respect to its inputs, which prohibits end-to-end gradient-based optimization. In this work, we propose NeuralSort, a general-purpose continuous relaxation of the output of the sorting operator from permutation matrices to the set of unimodal row-stochastic matrices, where every row sums to one and has a distinct arg max. This relaxation permits straight-through optimization of any computational graph that involves a sorting operation. Further, we use this relaxation to enable gradient-based stochastic optimization over the combinatorially large space of permutations by deriving a reparameterized gradient estimator for the Plackett-Luce family of distributions over permutations. We demonstrate the usefulness of our framework on three tasks that require learning semantic orderings of high-dimensional objects, including a fully differentiable, parameterized extension of the k-nearest neighbors algorithm.
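To make the relaxation described in this abstract concrete, here is a minimal pure-Python sketch. It is not the authors' code. The row formula, softmax of ((n + 1 - 2i) s - A_s 1) / tau with A_s the matrix of pairwise absolute differences, is an assumption taken from the published NeuralSort paper as best recalled; it does not appear in the abstract. The function name `neuralsort` and the parameter name `tau` are hypothetical.

```python
import math


def neuralsort(scores, tau=1.0):
    """Relaxed permutation matrix for sorting `scores` in descending order.

    Assumed formula (not stated in the abstract): row i (1-based) is
    softmax(((n + 1 - 2i) * s - A_s 1) / tau), with A_s[j][k] = |s_j - s_k|.
    """
    n = len(scores)
    # (A_s 1)_j: sum of absolute differences between s_j and every score.
    abs_diff_sums = [sum(abs(sj - sk) for sk in scores) for sj in scores]
    relaxed = []
    for i in range(1, n + 1):
        logits = [((n + 1 - 2 * i) * sj - aj) / tau
                  for sj, aj in zip(scores, abs_diff_sums)]
        # Numerically stable softmax over the row.
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        relaxed.append([e / total for e in exps])
    return relaxed


if __name__ == "__main__":
    for row in neuralsort([2.0, 5.0, 4.0], tau=1.0):
        print([round(p, 3) for p in row])
```

Each row sums to one, and for distinct scores the arg max of row i points at the i-th largest input, which matches the "unimodal row-stochastic" description above. Under this assumed formula, lowering `tau` should push the rows toward a hard permutation matrix.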
Tweets
arxivml: "Stochastic Optimization of Sorting Networks via Continuous Relaxations", Aditya Grover, Eric Wang, Aaron Zweig, St… https://t.co/jnPHYdQQZx
arxiv_cs_LG: Stochastic Optimization of Sorting Networks via Continuous Relaxations. Aditya Grover, Eric Wang, Aaron Zweig, and Stefano Ermon https://t.co/9NEWsnU8v2
Soul: Stochastic Optimization of Sorting Networks via Continuous Relaxations. https://t.co/aEWwxlFhCW
Github

Code for "Stochastic Optimization of Sorting Networks using Continuous Relaxations", ICLR 2019.

Repository: neuralsort
User: ermongroup
Language: Python
Stargazers: 1
Subscribers: 7
Forks: 0
Open Issues: 0
Youtube
None.
Other stats
Sample Sizes : [3, 5, 7, 9, 5, 9]
Authors: 4
Total Words: 11264
Unique Words: 2980

2.019 Mikeys
#3. Statistical Methods for Replicability Assessment
Kenneth Hung, William Fithian
Large-scale replication studies like the Reproducibility Project: Psychology (RP:P) provide invaluable systematic data on scientific replicability, but most analyses and interpretations of the data fail to agree on the definition of "replicability" and disentangle the inexorable consequences of known selection bias from competing explanations. We discuss three concrete definitions of replicability based on (1) whether published findings about the signs of effects are mostly correct, (2) how effective replication studies are in reproducing whatever true effect size was present in the original experiment, and (3) whether true effect sizes tend to diminish in replication. We apply techniques from multiple testing and post-selection inference to develop new methods that answer these questions while explicitly accounting for selection bias. Re-analyzing the RP:P data, we estimate that 22 out of 68 (32%) original directional claims were false (upper confidence bound 47%); by comparison, we estimate that among claims significant at the...
Figures
None.
Tweets
arxiv_org: Statistical Methods for Replicability Assessment. https://t.co/c9JIVn3eLM https://t.co/l00bkGB58s
wfithian: We explain our new methods and use them to re-analyze the RP:P data in our paper: https://t.co/YormrVlX5m
StatsPapers: Statistical Methods for Replicability Assessment. https://t.co/6AWFJKuLGG
kenhungkk: A new paper from @wfithian and me on replicability from a statistical perspective! We provide new metrics, new methods to estimate these metrics and applied them to the Reproducibility Project: Psychology data! https://t.co/DH56XWT7cA
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 0
Unique Words: 0

2.017 Mikeys
#4. Modelling Diffusion through Statistical Network Analysis: A Simulation Study
Johan A. Elkink, Thomas U. Grund
The study of international relations by definition deals with interdependencies among countries. One form of interdependence between countries is the diffusion of country-level features, such as policies, political regimes, or conflict. In these studies, the outcome variable tends to be categorical, and the primary concern is the clustering of the outcome variable among connected countries. Statistically, such clustering is studied with spatial econometric models. This paper instead proposes the use of a statistical network approach to model diffusion with a binary outcome variable. Using statistical network instead of spatial econometric models allows for a more natural specification of the diffusion process, assuming autocorrelation in the outcomes rather than the corresponding latent variable, and it simplifies the inclusion of temporal dynamics, higher level interdependencies and interactions between network ties and country-level features. In our simulations, the performance of the Stochastic Actor-Oriented Model...
Tweets
arxiv_org: Modelling Diffusion through Statistical Network Analysis: A Simulation Study. https://t.co/yQPeKIaoJl https://t.co/Ogb8nm6EiV
jelkink: Working paper with @thomasgrundUCD on using Stochastic Actor-Oriented Models for (static) spatial diffusion analysis now online at https://t.co/zf6yn2NHPr @ucdpolitics @ucdsociology
StatsPapers: Modelling Diffusion through Statistical Network Analysis: A Simulation Study. https://t.co/6Q0PTXYkkN
BrianKrent: RT @arxiv_org: Modelling Diffusion through Statistical Network Analysis: A Simulation Study. https://t.co/yQPeKIaoJl https://t.co/Ogb8nm6EiV
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 11344
Unique Words: 3189

2.013 Mikeys
#5. Transferability of Operational Status Classification Models Among Different Wind Turbine Types
Z. Trstanova, A. Martinsson, C. Matthews, S. Jimenez, B. Leimkuhler, T. Van Delft, M. Wilkinson
A detailed understanding of wind turbine performance status classification can improve operations and maintenance in the wind energy industry. Due to different engineering properties of wind turbines, the standard supervised learning models used for classification do not generalize across data sets obtained from different wind sites. We propose two methods to deal with the transferability of the trained models: first, data normalization in the form of power curve alignment, and second, a robust method based on convolutional neural networks and feature-space extension. We demonstrate the success of our methods on real-world data sets with industrial applications.
Tweets
arxivml: "Transferability of Operational Status Classification Models Among Different Wind Turbine Typesq", Z. Trstanova, A.… https://t.co/34jksd60M5
arxiv_cs_LG: Transferability of Operational Status Classification Models Among Different Wind Turbine Typesq. Z. Trstanova, A. Martinsson, C. Matthews, S. Jimenez, B. Leimkuhler, T. Van Delft, and M. Wilkinson https://t.co/vA4pMWY0Ht
Memoirs: Transferability of Operational Status Classification Models Among Different Wind Turbine Typesq. https://t.co/ZVPwPEvBNd
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 7
Total Words: 3740
Unique Words: 1337

2.013 Mikeys
#6. Hydra: A method for strain-minimizing hyperbolic embedding
Martin Keller-Ressel, Stephanie Nargang
We introduce hydra (hyperbolic distance recovery and approximation), a new method for embedding network- or distance-based data into hyperbolic space. We show mathematically that hydra satisfies a certain optimality guarantee: It minimizes the 'hyperbolic strain' between original and embedded data points. Moreover, it recovers points exactly, when they are located on a hyperbolic submanifold of the feature space. Testing on real network data we show that hydra typically outperforms existing hyperbolic embedding methods in terms of embedding quality.
Tweets
arxivml: "Hydra: A method for strain-minimizing hyperbolic embedding", Martin Keller-Ressel, Stephanie Nargang https://t.co/uv8L5D00rU
arxiv_cs_LG: Hydra: A method for strain-minimizing hyperbolic embedding. Martin Keller-Ressel and Stephanie Nargang https://t.co/ZX5ekrdBH0
Memoirs: Hydra: A method for strain-minimizing hyperbolic embedding. https://t.co/MlQ6UB5lzw
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 6377
Unique Words: 1960

2.013 Mikeys
#7. Exact slice sampler for Hierarchical Dirichlet Processes
Arash A. Amini, Marina Paez, Lizhen Lin, Zahra S. Razaee
We propose an exact slice sampler for the hierarchical Dirichlet process (HDP) and its associated mixture models (Teh et al., 2006). Although there are existing MCMC algorithms for sampling from the HDP, a slice sampler has been missing from the literature. Slice sampling is well known for its desirable properties, including fast mixing and a natural potential for parallelization. On the other hand, the hierarchical nature of HDPs poses challenges to adopting a full-fledged slice sampler that automatically truncates all the infinite measures involved without ad hoc modifications. In this work, we adopt the powerful idea of Bayesian variable augmentation to address this challenge. By introducing new latent variables, we obtain a full factorization of the joint distribution that is suitable for slice sampling. Our algorithm has several appealing features such as (1) fast mixing; (2) remaining exact while allowing natural truncation of the underlying infinite-dimensional measures, as in Kalli et al. (2011), resulting in updates of...
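For readers unfamiliar with slice sampling in general, the following sketch shows a generic univariate slice sampler using stepping-out and shrinkage. It is not the HDP sampler proposed in this paper and makes no use of the variable augmentation the abstract describes. The function name `slice_sample`, its parameters, and the standard normal target in the usage example are all illustrative assumptions.

```python
import math
import random


def slice_sample(log_density, x0, n_samples, width=1.0, rng=None):
    """Generic univariate slice sampler (stepping-out plus shrinkage).

    `log_density` is an unnormalized log density; `width` is the initial
    bracket size. Both names are illustrative, not taken from the paper.
    """
    rng = rng or random.Random(0)
    x = x0
    draws = []
    for _ in range(n_samples):
        # Auxiliary height: log of u * f(x) with u ~ Uniform(0, 1);
        # the `or` guards against the rare draw of exactly 0.0.
        log_y = log_density(x) + math.log(rng.random() or 1e-300)
        # Stepping out: place a bracket of size `width` randomly around x,
        # then grow each side until it leaves the slice.
        left = x - width * rng.random()
        right = left + width
        while log_density(left) > log_y:
            left -= width
        while log_density(right) > log_y:
            right += width
        # Shrinkage: sample inside the bracket, shrinking it on rejection.
        while True:
            candidate = rng.uniform(left, right)
            if log_density(candidate) > log_y:
                x = candidate
                break
            if candidate < x:
                left = candidate
            else:
                right = candidate
        draws.append(x)
    return draws


if __name__ == "__main__":
    # Usage: draw from a standard normal via its unnormalized log density.
    samples = slice_sample(lambda v: -0.5 * v * v, 0.0, 5000, width=2.0)
    print(sum(samples) / len(samples))
```

The auxiliary height variable is what lets the sampler work with an unnormalized density and adapt its step size automatically. In mixture-model slice samplers of the kind the abstract cites, similar auxiliary variables are what truncate an infinite collection of components to a finite one at each step.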
Figures
None.
Tweets
arxivml: "Exact slice sampler for Hierarchical Dirichlet Processes", Arash A. Amini, Marina Paez, Lizhen Lin, Zahra S. Razaee https://t.co/WSmsflB95K
arxiv_cs_LG: Exact slice sampler for Hierarchical Dirichlet Processes. Arash A. Amini, Marina Paez, Lizhen Lin, and Zahra S. Razaee https://t.co/VxCaA90eGI
Memoirs: Exact slice sampler for Hierarchical Dirichlet Processes. https://t.co/Z5SeZ5wIrM
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 5470
Unique Words: 1581

2.013 Mikeys
#8. Prescriptive Cluster-Dependent Support Vector Machines with an Application to Reducing Hospital Readmissions
Taiyao Wang, Ioannis Ch. Paschalidis
We augment linear Support Vector Machine (SVM) classifiers by adding three important features: (i) we introduce a regularization constraint to induce a sparse classifier; (ii) we devise a method that partitions the positive class into clusters and selects a sparse SVM classifier for each cluster; and (iii) we develop a method to optimize the values of controllable variables in order to reduce the number of data points which are predicted to have an undesirable outcome, which, in our setting, coincides with being in the positive class. The latter feature leads to personalized prescriptions/recommendations. We apply our methods to the problem of predicting and preventing hospital readmissions within 30 days of discharge for patients who underwent a general surgical procedure. To that end, we leverage a large dataset containing over 2.28 million patients who had surgeries in the period 2011-2014 in the U.S. The dataset was collected as part of the American College of Surgeons National Surgical Quality Improvement Program (NSQIP).
Tweets
arxivml: "Prescriptive Cluster-Dependent Support Vector Machines with an Application to Reducing Hospital Readmissions", Tai… https://t.co/qtDNwoPRD7
arxiv_cs_LG: Prescriptive Cluster-Dependent Support Vector Machines with an Application to Reducing Hospital Readmissions. Taiyao Wang and Ioannis Ch. Paschalidis https://t.co/bnCIlev7Qm
Memoirs: Prescriptive Cluster-Dependent Support Vector Machines with an Application to Reducing Hospital Readmissions. https://t.co/19tU3dPkMc
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 5373
Unique Words: 1971

2.013 Mikeys
#9. Variational Bayesian modelling of mixed-effects
Jean Daunizeau
This note is concerned with an accurate and computationally efficient variational Bayesian (VB) treatment of mixed-effects modelling. We focus on group studies, i.e. empirical studies that report multiple measurements acquired in multiple subjects. When approached from a Bayesian perspective, such mixed-effects models typically rely upon a hierarchical generative model of the data, whereby both within- and between-subject effects contribute to the overall observed variance. The ensuing VB scheme can be used to assess statistical significance at the group level and/or to capture inter-individual differences. Alternatively, it can be seen as an adaptive regularization procedure, which iteratively learns the corresponding within-subject priors from estimates of the group distribution of effects of interest (cf. so-called "empirical Bayes" approaches). We outline the mathematical derivation of the ensuing VB scheme, whose open-source implementation is available as part of the VBA toolbox.
Tweets
arxivml: "Variational Bayesian modelling of mixed-effects", Jean Daunizeau https://t.co/iJFxvtSsgH
arxiv_cs_LG: Variational Bayesian modelling of mixed-effects. Jean Daunizeau https://t.co/MAOfdBSrrp
Memoirs: Variational Bayesian modelling of mixed-effects. https://t.co/ISQilmi06j
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 1
Total Words: 3463
Unique Words: 1033

2.013 Mikeys
#10. Latent Simplex Position Model: High Dimensional Multi-view Clustering with Uncertainty Quantification
Leo L Duan
High dimensional data often contain multiple facets, and several clustering patterns (views) can co-exist under different feature subspaces. While multi-view clustering algorithms have been proposed, uncertainty quantification remains difficult: a particular challenge lies in the high complexity of estimating the cluster assignment probability under each view and/or efficiently sharing information across views. In this article, we propose an empirical Bayes approach: viewing the similarity matrices generated over subspaces as rough first-stage estimates for co-assignment probabilities, we obtain in their Kullback-Leibler neighborhood a refined low-rank soft cluster graph, formed by the pairwise product of simplex coordinates. Interestingly, each simplex coordinate directly encodes the cluster assignment uncertainty. For multi-view clustering, we equip each similarity matrix with a mixed membership over a small number of latent views, leading to effective dimension reduction. With a high model flexibility, the estimation can be...
Tweets
arxivml: "Latent Simplex Position Model: High Dimensional Multi-view Clustering with Uncertainty Quantification", Leo L Duan https://t.co/bqu6d1FgJL
arxiv_cs_LG: Latent Simplex Position Model: High Dimensional Multi-view Clustering with Uncertainty Quantification. Leo L Duan https://t.co/oVluVHYWsh
Memoirs: Latent Simplex Position Model: High Dimensional Multi-view Clustering with Uncertainty Quantification. https://t.co/A813eWnbQO
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 1
Total Words: 6974
Unique Words: 2213

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 99,586 papers.
