##### #1. Modelling Diffusion through Statistical Network Analysis: A Simulation Study
###### Johan A. Elkink, Thomas U. Grund
The study of international relations by definition deals with interdependencies among countries. One form of interdependence between countries is the diffusion of country-level features, such as policies, political regimes, or conflict. In these studies, the outcome variable tends to be categorical, and the primary concern is the clustering of the outcome variable among connected countries. Statistically, such clustering is usually studied with spatial econometric models. This paper instead proposes the use of a statistical network approach to model diffusion with a binary outcome variable. Using statistical network models instead of spatial econometric models allows for a more natural specification of the diffusion process, assuming autocorrelation in the outcomes rather than in the corresponding latent variable, and it simplifies the inclusion of temporal dynamics, higher-level interdependencies, and interactions between network ties and country-level features. In our simulations, the performance of the Stochastic Actor-Oriented Model...
##### #2. Prescriptive Cluster-Dependent Support Vector Machines with an Application to Reducing Hospital Readmissions
###### Taiyao Wang, Ioannis Ch. Paschalidis
We augment linear Support Vector Machine (SVM) classifiers by adding three important features: (i) we introduce a regularization constraint to induce a sparse classifier; (ii) we devise a method that partitions the positive class into clusters and selects a sparse SVM classifier for each cluster; and (iii) we develop a method to optimize the values of controllable variables in order to reduce the number of data points that are predicted to have an undesirable outcome, which, in our setting, coincides with being in the positive class. The third feature leads to personalized prescriptions/recommendations. We apply our methods to the problem of predicting and preventing hospital readmissions within 30 days of discharge for patients who underwent a general surgical procedure. To that end, we leverage a large dataset containing over 2.28 million patients who had surgeries in the period 2011-2014 in the U.S. The dataset was collected as part of the American College of Surgeons National Surgical Quality Improvement Program (NSQIP).
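As a rough illustration of feature (i), one common way to write a soft-margin linear SVM with a sparsity-inducing constraint is shown below. The symbols ($\beta$, $\beta_0$, $\xi_i$, $C$, $\lambda$) and the choice of an $\ell_1$ budget are assumptions made for this sketch; the abstract does not give the formulation, and the paper's may differ.

$$
\min_{\beta,\,\beta_0,\,\xi}\ \tfrac{1}{2}\|\beta\|_2^2 + C\sum_{i=1}^{n}\xi_i
\quad \text{s.t.}\quad
y_i\,(\beta^\top x_i + \beta_0) \ge 1 - \xi_i,\quad
\xi_i \ge 0,\quad
\|\beta\|_1 \le \lambda .
$$

Here $y_i \in \{-1, +1\}$ is the class label and $x_i$ the feature vector of data point $i$. Shrinking the budget $\lambda$ tends to push more coefficients of $\beta$ to exactly zero, which is what makes the resulting classifier sparse.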
##### #3. Large-Scale Online Experimentation with Quantile Metrics
###### Min Liu, Xiaohui Sun, Maneesh Varshney, Ya Xu
Online experimentation (or A/B testing) has been widely adopted in industry as the gold standard for measuring product impact. Despite this wide adoption, little of the literature discusses A/B testing with quantile metrics. Quantile metrics, such as 90th percentile page load time, are crucial to A/B testing because many key performance metrics, including site speed and service latency, are defined as quantiles. However, with LinkedIn's data size, quantile metric A/B testing is extremely challenging because there is no statistically valid and scalable variance estimator for the quantile of dependent samples: the bootstrap estimator is statistically valid but takes days to compute; the standard asymptotic variance estimate is scalable but results in order-of-magnitude underestimation. In this paper, we present a statistically valid and scalable methodology for A/B testing with quantiles that is fully generalizable to other A/B testing platforms. It achieves a speed-up of over 500 times compared to the bootstrap and has only a $2\%$ chance of differing from...
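To make the bootstrap baseline mentioned in this abstract concrete, here is a minimal sketch of a cluster bootstrap for the variance of a quantile. It is not the authors' method. It assumes that dependence arises because each user contributes several observations, so whole users are resampled; the function names, the nearest-rank quantile rule, and the toy data are all hypothetical.

```python
import math
import random
import statistics


def quantile(values, q):
    """Empirical q-quantile using the nearest-rank rule."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def cluster_bootstrap_variance(clusters, q, n_boot=500, seed=0):
    """Bootstrap variance of the pooled q-quantile.

    `clusters` is a list of lists, one inner list per user (assumed
    unit of dependence). Whole clusters are resampled with replacement
    so that within-user correlation is preserved in each replicate.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resampled = rng.choices(clusters, k=len(clusters))
        pooled = [x for cluster in resampled for x in cluster]
        estimates.append(quantile(pooled, q))
    return statistics.variance(estimates)


if __name__ == "__main__":
    # Hypothetical page load times (seconds) grouped by user.
    data = [[1.2, 1.4, 1.3], [3.8, 4.1], [0.9], [2.2, 2.5, 2.4, 2.6]]
    print(cluster_bootstrap_variance(data, q=0.9))
```

Each replicate re-sorts the pooled sample, so the cost grows with both the data size and the number of replicates. This is the scaling problem the abstract describes for very large experiments.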
##### #4. A Method for Measuring Network Effects of One-to-One Communication Features in Online A/B Tests
###### Guillaume Saint-Jacques, James Eric Sorenson, Nanyu Chen, Ya Xu
A/B testing is an important decision-making tool in product development because it can provide an accurate estimate of the average treatment effect of a new feature, which allows developers to understand the business impact of changes to products or algorithms. However, an important assumption of A/B testing, the Stable Unit Treatment Value Assumption (SUTVA), is not always valid, especially for products that facilitate interactions between individuals. In contexts like one-to-one messaging we should expect network interference: if an experimental manipulation is effective, members of the treatment group are likely to influence members of the control group by sending them messages, violating this assumption. In this paper, we propose a novel method that can be used to account for network effects when A/B testing changes to one-to-one interactions. Our method is an edge-based analysis that can be applied to standard Bernoulli randomized experiments to retrieve an average treatment effect that is not...
##### #5. Optimal Intermittent Measurements for Tumor Tracking in X-ray Guided Radiotherapy
###### Antoine Aspeel, Damien Dasnoy, Raphaël M. Jungers, Benoît Macq
In radiation therapy, tumor tracking is a challenging task that allows better dose delivery. One practice is to acquire X-ray images in real time during treatment, which are used to estimate the tumor location. This information is used to predict the tumor trajectory in the near future. Kalman prediction is a classical approach for this task. The main drawback of X-ray acquisition is that it irradiates the patient, including healthy tissue. In the classical Kalman framework, X-ray measurements are taken regularly, i.e. at a constant rate. In this paper, we propose a new approach that relaxes this constraint in order to take measurements when they are the most useful. Our aim is to optimize the tracking process for a given budget of measurements. This idea naturally leads to an optimal intermittent Kalman predictor for which measurement times are selected to minimize the mean squared prediction error over the complete fraction. This optimization problem can be solved directly when the respiratory model has been identified and...
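The idea of spending a fixed measurement budget where it helps most can be sketched for a scalar linear-Gaussian model. Everything below is an illustrative assumption rather than the authors' algorithm: the state model $x_{k+1} = a x_k + w_k$ with noisy measurements $y_k = x_k + v_k$, the parameter values, and the exhaustive search over schedules. The sketch relies on the standard fact that the Kalman error variance does not depend on the measured values, so candidate schedules can be compared offline.

```python
from itertools import combinations


def total_prediction_mse(schedule, horizon, a=1.0, q=1.0, r=0.5, p0=1.0):
    """Sum of one-step prediction error variances over the horizon.

    schedule: time steps at which a measurement is taken.
    a, q, r, p0: state gain, process noise variance, measurement noise
    variance and initial prior variance (all hypothetical values).
    """
    measured = set(schedule)
    p = p0  # variance of the prediction of x_0 before any data
    total = 0.0
    for k in range(horizon):
        total += p  # mean squared error of the prediction of x_k
        if k in measured:
            p = p * r / (p + r)  # Kalman measurement update
        p = a * a * p + q  # time update to the next step
    return total


def best_schedule(horizon, budget, **model):
    """Exhaustively pick the schedule with the smallest total error."""
    return min(
        combinations(range(horizon), budget),
        key=lambda s: total_prediction_mse(s, horizon, **model),
    )


if __name__ == "__main__":
    schedule = best_schedule(horizon=10, budget=3)
    print(schedule, total_prediction_mse(schedule, 10))
```

Exhaustive search is only practical for short horizons and small budgets, since the number of candidate schedules grows combinatorially.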
##### #6. Is Basketball a Game of Runs?
###### Mark F. Schilling
Basketball is often referred to as "a game of runs." We investigate the appropriateness of this claim using data from the full NBA 2016-17 season, comparing actual longest runs of scoring events to what long run theory predicts under the assumption that team "momentum" is not present. We provide several different variations of the analysis. Our results consistently indicate that the lengths of longest runs in NBA games are no longer than those that would occur naturally when scoring events are generated by a random process, rather than one that is influenced by "momentum".
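The comparison described in this abstract can be approximated by simulation instead of long run theory. The sketch below is an assumption-laden illustration and not the paper's analysis: it treats each scoring event as an independent draw won by the home team with a fixed probability, defines a run as consecutive scoring events by the same team, and uses hypothetical numbers throughout.

```python
import random


def longest_run(events):
    """Length of the longest streak of consecutive identical entries."""
    best = current = 0
    previous = object()  # sentinel that equals no real event
    for event in events:
        current = current + 1 if event == previous else 1
        previous = event
        best = max(best, current)
    return best


def simulate_longest_runs(n_events, p_home, n_sims, seed=0):
    """Longest runs in games simulated without any momentum effect."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sims):
        game = ["H" if rng.random() < p_home else "A" for _ in range(n_events)]
        results.append(longest_run(game))
    return results


def upper_tail_p_value(observed, simulated):
    """Share of simulated games whose longest run is at least `observed`."""
    return sum(r >= observed for r in simulated) / len(simulated)


if __name__ == "__main__":
    # Hypothetical game: 100 scoring events, observed longest run of 7.
    sims = simulate_longest_runs(n_events=100, p_home=0.5, n_sims=2000)
    print(upper_tail_p_value(7, sims))
```

A large share here would suggest that the observed run is unremarkable under the no-momentum assumption, which is the kind of comparison the abstract describes.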
