Top 7 arXiv Papers Today in Methodology


2.033 Mikeys
#1. Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics
Guido W. Imbens
In this essay I discuss potential outcome and graphical approaches to causality, and their relevance for empirical work in economics. I review some of the work on directed acyclic graphs, including the recent "The Book of Why," by Pearl and MacKenzie. I also discuss the potential outcome framework developed by Rubin and coauthors, building on work by Neyman. I then discuss the relative merits of these approaches for empirical work in economics, focusing on the questions each answers well, and why much of the work in economics is closer in spirit to the potential outcome framework.
Figures
None.
Tweets
ThomasVConti: Recommended reading of the day, an important new paper by Imbens on causal inference frameworks for research in economics: "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics" https://t.co/9YbFMG4BLb
Kweku_OA: New paper by Guido Imbens: "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics" (paper: https://t.co/DYGdp7xRq2)
autoregress: Fun to see snippets of twitter convos with @eliasbareinboim, @PHuenermund @Jabaluck et al incorporated in this new Imbens paper. https://t.co/NeFCRAF74M Almost certain the conversation is far from over... :) https://t.co/JzaDx5stCa
hmmlowe: This is a really great read for (a) economists new to DAGs (b) economists wanting to see twitter debates referenced in academic papers Next step: @Jabaluck should include his tweets on google scholar https://t.co/NkBsnsrZWd https://t.co/65Evmd5LlG
d_f_stone: @thosjleeper @kmmunger Good critique/discussion by Imbens here- https://t.co/qPRfUVkkGk (ht @estebanjq3)
chrisbboyer: Imbens on DAGs... https://t.co/lZr2RNW6BS https://t.co/PfX6lux0P6
StatsPapers: Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics. https://t.co/0ML7IYtkSy
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 1
Total Words: 23917
Unique Words: 5200

2.027 Mikeys
#2. Scalar-on-function local linear regression and beyond
Frédéric Ferraty, Stanislav Nagy
Regressing a scalar response on a random function is nowadays a common situation. In the nonparametric setting, this paper paves the way for making the local linear regression based on a projection approach a prominent method for solving this regression problem. Our asymptotic results demonstrate that the functional local linear regression outperforms its functional local constant counterpart. Beyond the estimation of the regression operator itself, the local linear regression is also a useful tool for predicting the functional derivative of the regression operator, a promising mathematical object on its own. The local linear estimator of the functional derivative is shown to be consistent. On simulated datasets we illustrate good finite sample properties of both proposed methods. On a real data example of a single-functional index model we indicate how the functional derivative of the regression operator provides an original and fast, widely applicable estimating method.
Figures
None.
Tweets
ArtofWarm: Scalar-on-function local linear regression and beyond https://t.co/rHB5JEGfds https://t.co/dFlNLGG7pT
StatsPapers: Scalar-on-function local linear regression and beyond. https://t.co/6sy2kSYV4r
cristobalvega: RT @StatsPapers: Scalar-on-function local linear regression and beyond. https://t.co/6sy2kSYV4r
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 21670
Unique Words: 4238

2.006 Mikeys
#3. A Multivariate Extreme Value Theory Approach to Anomaly Clustering and Visualization
Maël Chiapino, Stéphan Clémençon, Vincent Feuillard, Anne Sabourin
In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector $X = (X_1, \ldots, X_d)$ valued in $\mathbb{R}^d$, correspond to the simultaneous occurrence of extreme values for certain subgroups $\alpha \subset \{1, \ldots, d\}$ of variables $X_j$. Under the heavy-tail assumption, which is precisely appropriate for modeling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events/subgroups. This paper exploits this approach much further by means of a novel mixture model that makes it possible to describe the distribution of extremal observations and where the anomaly type $\alpha$ is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type $\alpha$, implicitly defining a similarity measure between anomalies. It is explained at length how the latter makes it possible to cluster extreme observations...
Figures
None.
Tweets
arxivml: "A Multivariate Extreme Value Theory Approach to Anomaly Clustering and Visualization", Maël Chiapino, Stéphan Clém… https://t.co/O36hBvfPJ0
StatsPapers: A Multivariate Extreme Value Theory Approach to Anomaly Clustering and Visualization. https://t.co/BmBFSs6bJ3
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 0
Unique Words: 0

2.003 Mikeys
#4. Application of Cox Model to predict the survival of patients with Chronic Heart Failure: A latent class regression approach
John Mbotwa, Marc de Kamps, Paul D. Baxter, Mark S. Gilthorpe
Most prediction models that are used in medical research fail to accurately predict health outcomes due to methodological limitations. Using routinely collected patient data, we explore the use of a Cox proportional hazards (PH) model within a latent class framework to model survival of patients with chronic heart failure (CHF). We identify subgroups of patients based on their risk with the aid of available covariates. We allow each subgroup to have its own risk model. We choose an optimum number of classes based on the reported Bayesian information criterion (BIC). We assess the discriminative ability of the chosen model using the area under the receiver operating characteristic curve (AUC) for all the cross-validated and bootstrapped samples. We conduct a simulation study to compare the predictive performance of our models. Our proposed latent class model outperforms the standard one-class Cox PH model.
Figures
Tweets
StatsPapers: Application of Cox Model to predict the survival of patients with Chronic Heart Failure: A latent class regression approach. https://t.co/hAluQ4AA1R
Github
None.
Youtube
None.
Other stats
Sample Sizes : [11]
Authors: 4
Total Words: 6287
Unique Words: 2016

2.001 Mikeys
#5. Factor copula models for mixed data
Sayed H. Kadhem, Aristidis K. Nikoloulopoulos
We develop factor copula models for analysing the dependence among mixed continuous and discrete responses. Factor copula models are canonical vine copulas that involve both observed and latent variables, hence they allow tail, asymmetric and non-linear dependence. They can be explained as conditional independence models with latent variables that don't necessarily have an additive latent structure. We focus on important issues that would interest the social data analyst, such as model selection and goodness-of-fit. Our general methodology is demonstrated with an extensive simulation study and illustrated by re-analysing three mixed response datasets. Our study suggests that there can be a substantial improvement over the standard factor model for mixed data and makes the argument for moving to factor copula models.
Figures
None.
Tweets
StatsPapers: Factor copula models for mixed data. https://t.co/ROEVLsql1E
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 11198
Unique Words: 2866

2.001 Mikeys
#6. Optimal Sampling for Generalized Linear Models under Measurement Constraints
Tao Zhang, Yang Ning, David Ruppert
Suppose we are using a generalized linear model to predict a scalar outcome $Y$ given a covariate vector $X$. We consider two related problems and propose a methodology for both. In the first problem, every data point in a large dataset has both $Y$ and $X$ known, but we wish to use a subset of the data to limit computational costs. In the second problem, sometimes called "measurement constraints," $Y$ is expensive to measure and initially is available only for a small portion of the data. The goal is to select another subset of data where $Y$ will also be measured. We focus on the more challenging but less well-studied measurement constraint problem. A popular approach for the first problem is sampling. However, most existing sampling algorithms require that $Y$ be measured at all data points, so they cannot be used under measurement constraints. We propose an optimal sampling procedure for massive datasets under measurement constraints (OSUMC). We show consistency and asymptotic normality of estimators from a general class of sampling...
Figures
None.
Tweets
StatsPapers: Optimal Sampling for Generalized Linear Models under Measurement Constraints. https://t.co/vNGW0ndt4v
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unique Words: 0

2.001 Mikeys
#7. Assessing Treatment Effect Variation in Observational Studies: Results from a Data Challenge
Carlos Carvalho, Avi Feller, Jared Murray, Spencer Woody, David Yeager
A growing number of methods aim to assess the challenging question of treatment effect variation in observational studies. This special section of "Observational Studies" reports the results of a workshop conducted at the 2018 Atlantic Causal Inference Conference designed to understand the similarities and differences across these methods. We invited eight groups of researchers to analyze a synthetic observational data set that was generated using a recent large-scale randomized trial in education. Overall, participants employed a diverse set of methods, ranging from matching and flexible outcome modeling to semiparametric estimation and ensemble approaches. While there was broad consensus on the topline estimate, there were also large differences in estimated treatment effect moderation. This highlights the fact that estimating varying treatment effects in observational studies is often more challenging than estimating the average treatment effect alone. We suggest several directions for future work arising from this workshop.
Figures
Tweets
StatsPapers: Assessing Treatment Effect Variation in Observational Studies: Results from a Data Challenge. https://t.co/cM74cYd8TN
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 5791
Unique Words: 2003

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 160,428 papers.
