##### #1. Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics
###### Guido W. Imbens
In this essay I discuss potential outcome and graphical approaches to causality, and their relevance for empirical work in economics. I review some of the work on directed acyclic graphs, including the recent "The Book of Why," by Pearl and MacKenzie. I also discuss the potential outcome framework developed by Rubin and coauthors, building on work by Neyman. I then discuss the relative merits of these approaches for empirical work in economics, focusing on the questions each answers well, and on why much of the work in economics is closer in spirit to the potential outcome framework.
##### #2. Amortized Monte Carlo Integration
###### Adam Goliński, Frank Wood, Tom Rainforth
Current approaches to amortizing Bayesian inference focus solely on approximating the posterior distribution. Typically, this approximation is, in turn, used to calculate expectations for one or more target functions, a computational pipeline that is inefficient when the target function(s) are known upfront. In this paper, we address this inefficiency by introducing AMCI, a method for amortizing Monte Carlo integration directly. AMCI operates similarly to amortized inference but produces three distinct amortized proposals, each tailored to a different component of the overall expectation calculation. At runtime, samples are produced separately from each amortized proposal, before being combined into an overall estimate of the expectation. We show that while existing approaches are fundamentally limited in the level of accuracy they can achieve, AMCI can theoretically produce arbitrarily small errors for any integrable target function using only a single sample from each proposal at runtime. We further show that it is able to...
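The idea of combining separately sampled components into one expectation can be illustrated with a small sketch. It is not the authors' implementation: it assumes the three components are the positive part of the target function, its negative part, and the normalizing constant, and it replaces the learned (amortized) proposals with fixed, hand-picked Gaussians. The toy model (`x ~ N(0, 1)`, `y | x ~ N(x, 1)`) and all function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def joint(x, y):
    """Toy joint density p(x, y) = p(x) p(y | x); an assumption for illustration."""
    return normal_pdf(x, 0.0, 1.0) * normal_pdf(y, x, 1.0)


def importance_estimate(integrand, proposal_mean, proposal_sd, n, rng):
    """Importance-sampling estimate of the integral of `integrand` over x."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(proposal_mean, proposal_sd)
        total += integrand(x) / normal_pdf(x, proposal_mean, proposal_sd)
    return total / n


def three_proposal_expectation(f, y, n=20000, seed=0):
    """Estimate E[f(x) | y] as (E1_plus - E1_minus) / E2 with separate proposals."""
    rng = random.Random(seed)
    # Hand-picked proposals (assumption); an amortized method would learn them.
    e1_plus = importance_estimate(
        lambda x: max(f(x), 0.0) * joint(x, y), 1.0, 0.8, n, rng)
    e1_minus = importance_estimate(
        lambda x: max(-f(x), 0.0) * joint(x, y), -0.5, 0.8, n, rng)
    e2 = importance_estimate(lambda x: joint(x, y), 0.5, 0.8, n, rng)
    return (e1_plus - e1_minus) / e2


# For this toy model the posterior is N(0.5, 0.5), so E[x | y = 1] is 0.5.
estimate = three_proposal_expectation(lambda x: x, y=1.0)
print(round(estimate, 2))
```

Because each component gets its own proposal, each proposal can be concentrated where its own integrand has mass, which a single posterior approximation cannot do for all three at once.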
##### #3. Robust data-driven discovery of governing physical laws using a new subsampling-based sparse Bayesian method to tackle four challenges (large noise, outliers, data integration, and extrapolation)
###### Sheng Zhang, Guang Lin
The derivation of physical laws is a dominant topic in scientific research. We propose a new method for discovering physical laws from data that tackles four challenges faced by previous methods. The four challenges are: (1) large noise in the data, (2) outliers in the data, (3) integrating the data collected from different experiments, and (4) extrapolating the solutions to areas that have no available data. To resolve these four challenges, we try to discover the governing differential equations and develop a model-discovering method based on sparse Bayesian inference and subsampling. The subsampling technique is used here to improve the accuracy of the Bayesian learning algorithm, whereas elsewhere it is usually employed for estimating statistics or speeding up algorithms. The optimal subsampling size is moderate, neither too small nor too big. Another merit of our method is that it can work with limited data by virtue of Bayesian inference. We demonstrate how to use our method to tackle the four aforementioned...
##### #4. optimalFlow: Optimal-transport approach to flow cytometry gating and population matching
###### Eustasio del Barrio, Hristo Inouzhe, Jean-Michel Loubes, Carlos Matrán, Agustín Mayo-Íscar
Data used in flow cytometry present pronounced variability due to biological and technical reasons. Biological variability is a well-known phenomenon produced by measurements on different individuals with different characteristics such as age, sex, etc. Technical sources of variability include the use of different measurement settings, variation in conditions during experiments, and different types of flow cytometers. This high variability makes it difficult to use supervised machine learning to identify cell populations. We propose optimalFlowTemplates, based on a similarity distance and Wasserstein barycenters, which clusters cytometries and produces prototype cytometries for the different groups. We show that supervised learning restricted to the new groups performs better than the same techniques applied to the whole collection. We also present optimalFlowClassification, which uses a database of gated cytometries and optimalFlowTemplates to assign cell types to a new cytometry. We show...
##### #5. Least Angle Regression in Tangent Space and LASSO for Generalized Linear Model
###### Yoshihiro Hirose
We propose sparse estimation methods for generalized linear models, which run Least Angle Regression (LARS) and the Least Absolute Shrinkage and Selection Operator (LASSO) in the tangent space of the manifold of the statistical model. Our approach is to roughly approximate the statistical model and to subsequently use exact calculations. LARS was proposed as an efficient algorithm for parameter estimation and variable selection for the normal linear model. The LARS algorithm is described in terms of Euclidean geometry, with correlation regarded as the metric of the space. Since the LARS algorithm only works in Euclidean space, we transform a manifold of the statistical model into the tangent space at the origin. In generalized linear regression, this transformation allows us to run the original LARS algorithm for generalized linear models. The proposed methods are efficient and perform well. Real-data analysis shows that the proposed methods produce results similar to those of the $l_1$-penalized maximum likelihood estimation...
##### #6. An Adaptive Approach for Anomaly Detector Selection and Fine-Tuning in Time Series
###### Hui Ye, Xiaopeng Ma, Qingfeng Pan, Huaqiang Fang, Hang Xiang, Tongzhen Shao
Anomaly detection in time series is an active topic in time series data mining. The characteristics of each anomaly detector determine the kinds of abnormal data it detects well, and no single detector is optimal for all types of anomalies. Moreover, difficulties remain in industrial production, for example because a single detector cannot be optimal across different time windows of the same time series. This paper proposes an adaptive model, called ATSDLN (Adaptive Time Series Detector Learning Network), that selects an appropriate detector and run-time parameters for anomaly detection based on time series characteristics. We take the time series as the input of the model and learn the time series representation through an FCN. To select detectors and run-time parameters adaptively according to the input time series, the outputs of the FCN are fed into two sub-networks: the detector selection network and the run-time parameters selection network. In addition, the way that the variable...
##### #7. A discriminative approach for finding and characterizing positivity violations using decision trees
###### Ehud Karavani, Peter Bak, Yishai Shimoni
The assumption of positivity in causal inference (also known as common support and covariate overlap) is necessary to obtain valid causal estimates. Therefore, confirming that it holds in a given dataset is an important first step of any causal analysis. The most common methods to date are insufficient for discovering non-positivity, as they do not scale to modern high-dimensional covariate spaces, or they cannot pinpoint the subpopulation violating positivity. To overcome these issues, we suggest harnessing decision trees to detect violations. By dividing the covariate space into mutually exclusive regions, each with maximized homogeneity of treatment groups, decision trees can be used to automatically detect subspaces violating positivity. By augmenting the method with an additional random forest model, we can quantify the robustness of the violation within each subspace. This solution is scalable and provides an interpretable characterization of the subspaces in which violations occur. We provide a visualization of the...
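The tree-based idea can be sketched in a few lines. The following is a minimal illustration, not the authors' method: it grows a small binary tree that predicts treatment from covariates using Gini impurity, then flags leaves whose treated fraction is close to 0 or 1. The random forest step is omitted, and the threshold `eps`, the toy data, and all names are assumptions made for the example.

```python
from collections import namedtuple

# A leaf region: the split conditions leading to it, its size, and the
# fraction of treated units it contains.
Leaf = namedtuple("Leaf", ["path", "n", "treated_fraction"])


def gini_mass(labels):
    """Gini impurity of binary labels, weighted by the number of labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p) * len(labels)


def best_split(rows, labels, min_leaf):
    """Return the (feature, threshold) that most reduces impurity, or None."""
    best, best_score = None, gini_mass(labels)
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [t for row, t in zip(rows, labels) if row[j] <= threshold]
            right = [t for row, t in zip(rows, labels) if row[j] > threshold]
            if min(len(left), len(right)) < min_leaf:
                continue
            score = gini_mass(left) + gini_mass(right)
            if score < best_score - 1e-12:
                best, best_score = (j, threshold), score
    return best


def grow(rows, labels, depth, min_leaf=10, path=()):
    """Recursively partition the covariate space; return the list of leaves."""
    split = best_split(rows, labels, min_leaf) if depth > 0 else None
    if split is None:
        return [Leaf(path, len(labels), sum(labels) / len(labels))]
    j, threshold = split
    leaves = []
    for op, keep in (("<=", lambda v: v <= threshold),
                     (">", lambda v: v > threshold)):
        idx = [i for i, row in enumerate(rows) if keep(row[j])]
        leaves += grow([rows[i] for i in idx], [labels[i] for i in idx],
                       depth - 1, min_leaf, path + ((j, op, threshold),))
    return leaves


def positivity_violations(leaves, eps=0.05):
    """Leaves where almost everyone, or almost no one, is treated."""
    return [leaf for leaf in leaves
            if leaf.treated_fraction <= eps or leaf.treated_fraction >= 1.0 - eps]


# Toy data (assumption): covariates are (age, sex); nobody aged 65+ is treated.
rows, labels = [], []
for age in range(20, 80):
    for k in range(4):
        rows.append((age, k // 2))
        labels.append(k % 2 if age < 65 else 0)

violations = positivity_violations(grow(rows, labels, depth=3))
for leaf in violations:
    print(leaf)
```

Each flagged leaf carries the chain of split conditions that defines it, which is what makes the characterization of the violating subpopulation interpretable.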
##### #8. Scalar-on-function local linear regression and beyond
###### Frédéric Ferraty, Stanislav Nagy
Regressing a scalar response on a random function is nowadays a common situation. In the nonparametric setting, this paper paves the way for making the local linear regression based on a projection approach a prominent method for solving this regression problem. Our asymptotic results demonstrate that the functional local linear regression outperforms its functional local constant counterpart. Beyond the estimation of the regression operator itself, the local linear regression is also a useful tool for predicting the functional derivative of the regression operator, a promising mathematical object on its own. The local linear estimator of the functional derivative is shown to be consistent. On simulated datasets we illustrate good finite sample properties of both proposed methods. On a real data example of a single-functional index model we indicate how the functional derivative of the regression operator provides an original and fast, widely applicable estimating method.
##### #9. Sequential Pattern mining of Longitudinal Adverse Events After Left Ventricular Assist Device Implant
###### Faezeh Movahedi, Robert L. Kormos, Lisa Lohmueller, Laura Seese, Manreet Kanwar, Srinivas Murali, Yiye Zhang, Rema Padman, James F. Antaki
Left ventricular assist devices (LVADs) are an increasingly common therapy for patients with advanced heart failure. However, implantation of the LVAD increases the risk of stroke, infection, bleeding, and other serious adverse events (AEs). Most post-LVAD AE studies have focused on individual AEs in isolation, neglecting the possible interrelation or causality between AEs. This study is the first to conduct an exploratory analysis to discover common sequential chains of AEs following LVAD implantation that are correlated with important clinical outcomes. This analysis was derived from 58,575 recorded AEs for 13,192 patients in the International Registry for Mechanical Circulatory Support (INTERMACS) who received a continuous-flow LVAD between 2006 and 2015. The pattern mining procedure involved three main steps: (1) creating a bank of AE sequences by converting the AEs for each patient into a single, chronologically sequenced record, (2) grouping patients with similar AE sequences using hierarchical clustering, and (3) extracting...
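Step (1) of the procedure, turning per-event records into one chronological sequence per patient, can be sketched as follows. This is only an illustration under assumed inputs: the record layout `(patient_id, day, adverse_event)`, the event labels, and the function name are hypothetical and not taken from the registry or the paper. Steps (2) and (3) are not shown.

```python
from collections import defaultdict


def build_sequences(events):
    """Group AE records by patient and order each group chronologically.

    `events` is an iterable of (patient_id, day, adverse_event) tuples,
    where `day` is the number of days since implantation (assumed layout).
    """
    by_patient = defaultdict(list)
    for patient_id, day, adverse_event in events:
        by_patient[patient_id].append((day, adverse_event))
    return {
        patient_id: [ae for _, ae in sorted(records, key=lambda r: r[0])]
        for patient_id, records in by_patient.items()
    }


# Made-up example records, deliberately out of chronological order.
events = [
    ("p1", 30, "bleeding"),
    ("p2", 5, "infection"),
    ("p1", 3, "infection"),
    ("p1", 90, "stroke"),
]

sequences = build_sequences(events)
for patient_id, sequence in sequences.items():
    print(patient_id, sequence)
```

The resulting dictionary of sequences is the kind of "bank" that a later clustering step could consume, given some distance between sequences.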
##### #10. Self Organizing Supply Chains for Micro-Prediction: Present and Future uses of the ROAR Protocol
###### Peter Cotton
A multi-agent system is trialed as a means of crowd-sourcing inexpensive but high-quality streams of predictions. Each agent is a microservice embodying statistical models and endowed with economic self-interest. The ability to fork and modify simple agents is granted to a large number of employees in a firm, and empirical lessons are reported. We suggest that one plausible trajectory for this project is the creation of a Prediction Web.
