Clinical prediction models (CPMs) are used to predict clinically relevant
outcomes or events. Typically, prognostic CPMs are derived to predict the risk
of a single future outcome. However, with rising emphasis on the prediction of
multi-morbidity, there is growing need for CPMs to simultaneously predict risks
for each of multiple future outcomes. A common approach to multi-outcome risk
prediction is to derive a CPM for each outcome separately, then multiply the
predicted risks. This approach is only valid if the outcomes are conditionally
independent given the covariates, and it fails to exploit the potential
relationships between the outcomes. This paper outlines several approaches that
could be used to develop prognostic CPMs for multiple outcomes. We consider
four methods, ranging in complexity and in their conditional independence
assumptions: namely, probabilistic classifier chain, multinomial logistic
regression, multivariate logistic regression, and a Bayesian probit model.
These are compared with methods that rely on...
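
To make the conditional independence point concrete, here is a minimal sketch in Python. All probabilities are hypothetical values for a single patient, not taken from the paper, and the function names are illustrative. It contrasts the product of separately predicted risks with the chain-rule factorisation P(Y1=1 | x) * P(Y2=1 | Y1=1, x), which is the factorisation a probabilistic classifier chain relies on.

```python
def joint_risk_independent(p1, p2):
    """Joint risk of both outcomes, assuming conditional independence given x."""
    return p1 * p2


def joint_risk_chain(p1, p2_given_y1):
    """Joint risk via the chain rule: P(Y1=1 | x) * P(Y2=1 | Y1=1, x)."""
    return p1 * p2_given_y1


# Hypothetical risks for one patient (assumed values, for illustration only).
p1 = 0.20                # P(Y1=1 | x)
p2_given_y1 = 0.60       # P(Y2=1 | Y1=1, x)
p2_given_not_y1 = 0.225  # P(Y2=1 | Y1=0, x)

# Marginal risk of the second outcome, by the law of total probability.
p2 = p2_given_y1 * p1 + p2_given_not_y1 * (1 - p1)

naive = joint_risk_independent(p1, p2)
chained = joint_risk_chain(p1, p2_given_y1)
print(f"marginal P(Y2=1 | x): {p2:.3f}")
print(f"product of separate risks: {naive:.3f}")
print(f"chain-rule joint risk: {chained:.3f}")
```

With these assumed inputs the outcomes are positively dependent, so the product of the separate risks understates the joint risk.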

MatthewSperrin:
Preprint: Clinical Prediction Models to Predict the Risk of Multiple Binary
Outcomes: a comparison of approaches
https://t.co/mbmdDlMMzO work with @glen_martin1 @Richard_D_Riley @Kym_Snell and @profbuchan - comments welcome!

StatsPapers:
Clinical Prediction Models to Predict the Risk of Multiple Binary Outcomes: a comparison of approaches. https://t.co/zyQT62VyBP

Sample Sizes : None.

Authors: 5

Total Words: 0

Unique Words: 0

When constructing a model to estimate the causal effect of a treatment, it is
necessary to control for other factors which may have confounding effects.
Because the ignorability assumption is not testable, however, it is usually
unclear which set of controls is appropriate, and effect estimation is
generally sensitive to this choice. A common approach in this case is to fit
several models, each with a different set of controls, but it is difficult to
reconcile inference under the multiple resulting posterior distributions for
the treatment effect. Therefore we propose a two-stage approach to measure the
sensitivity of effect estimation with respect to control specification. In the
first stage, a model is fit with all available controls using a prior carefully
selected to adjust for confounding. In the second stage, posterior
distributions are calculated for the treatment effect under nested sets of
controls by propagating posterior uncertainty in the original model. We
demonstrate how our approach can be used to detect the most...

StatsPapers:
Bayesian inference for treatment effects under nested subsets of controls. https://t.co/PTcwQhNOkK

Sample Sizes : None.

Authors: 3

Total Words: 0

Unique Words: 0

A key difficulty that arises from real event data is imprecision in the
recording of event time-stamps. In many cases, retaining event times with a
high precision is expensive due to the sheer volume of activity. Combined with
practical limits on the accuracy of measurements, aggregated data is common. In
order to use point processes to model such event data, tools for handling
parameter estimation are essential. Here we consider parameter estimation of
the Hawkes process, a type of self-exciting point process that has found
application in the modeling of financial stock markets, earthquakes and social
media cascades. We develop a novel optimization approach to parameter
estimation of aggregated Hawkes processes using a Monte Carlo
Expectation-Maximization (MC-EM) algorithm. Through a detailed simulation
study, we demonstrate that existing methods are capable of producing severely
biased and highly variable parameter estimates and that our novel MC-EM method
significantly outperforms them in all studied circumstances. These...
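
As a hedged illustration of the setting (not the MC-EM algorithm itself), the sketch below evaluates a Hawkes conditional intensity and then aggregates exact event times into bin counts, which is the kind of coarsened data the abstract describes. The exponential kernel, the parameter names `mu`, `alpha`, `beta`, and both function names are assumptions made for this example; the abstract does not specify a kernel.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity at time t for an assumed exponential kernel:
    lambda(t) = mu + sum over past events s < t of alpha * beta * exp(-beta * (t - s)).
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return mu + excitation


def aggregate_counts(event_times, bin_width, n_bins):
    """Replace exact time-stamps by counts per bin of width bin_width."""
    counts = [0] * n_bins
    for s in event_times:
        k = int(s // bin_width)
        if 0 <= k < n_bins:  # events outside the observation window are dropped
            counts[k] += 1
    return counts


events = [0.1, 0.4, 1.2, 2.9]  # hypothetical exact event times
print(hawkes_intensity(3.0, events, mu=0.5, alpha=0.8, beta=1.2))
print(aggregate_counts(events, bin_width=1.0, n_bins=3))
```

Once the data are reduced to counts, the exact times needed to evaluate the intensity are no longer observed, which is why estimation from aggregated data needs dedicated tools.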

eakcohen:
New preprint from Leigh, another one of my brilliant PhD students. Understanding the repercussions of aggregating/binning event data and developing methodologies for handling data of this type is one of the areas my group is currently interested in https://t.co/LjD278fpiU https://t.co/s8ZVvbRdOZ

StatsPapers:
A Monte Carlo EM Algorithm for the Parameter Estimation of Aggregated Hawkes Processes. https://t.co/5P48bpfIcR

Sample Sizes : None.

Authors: 4

Total Words: 0

Unique Words: 0

With multiple potential mediators on the causal pathway from a treatment to
an outcome, we consider the problem of decomposing the effects along multiple
possible causal paths through each distinct mediator. Under Pearl's
path-specific effects framework (Pearl, 2001; Avin et al., 2005), such
fine-grained decompositions necessitate stringent assumptions, such as
correctly specifying the causal structure among the mediators, and there being
no unobserved confounding among the mediators. In contrast, interventional
direct and indirect effects for multiple mediators (Vansteelandt and Daniel,
2017) can be identified under much weaker conditions, while providing
scientifically relevant causal interpretations. Nonetheless, current estimation
approaches require (correctly) specifying a model for the joint mediator
distribution, which can be difficult when there is a high-dimensional set of
possibly continuous and non-continuous mediators. In this article, we avoid the
need for modeling this distribution, by building on a definition...

StatsPapers:
Non-linear Mediation Analysis with High-dimensional Mediators whose Causal Structure is Unknown. https://t.co/35VbZZlRJt

Sample Sizes : None.

Authors: 4

Total Words: 0

Unique Words: 0

Correlation and smoothness are terms used to describe a wide variety of
random quantities. In time, space, and many other domains, they both imply the
same idea: quantities that occur closer together are more similar than those
further apart. Two popular statistical models that represent this idea are
basis-penalty smoothers (Wood, 2017) and stochastic partial differential
equations (SPDE) (Lindgren et al., 2011). In this paper, we discuss how the
SPDE can be interpreted as a smoothing penalty and can be fitted using the R
package mgcv, allowing practitioners with existing knowledge of smoothing
penalties to better understand the implementation and theory behind the SPDE
approach.

StatsPapers:
Understanding the stochastic partial differential equation approach to smoothing. https://t.co/hc2lM27uK4

Sample Sizes : None.

Authors: 3

Total Words: 7608

Unique Words: 2248

In regression models for spatial data, it is often assumed that the marginal
effects of covariates on the response are constant over space. In practice,
this assumption might often be questionable. In this article, we show how a
Gaussian process-based spatially varying coefficient (SVC) model can be
estimated using maximum likelihood estimation (MLE). In addition, we present an
approach that scales to large data by applying covariance tapering. We compare
our methodology to existing methods such as a Bayesian approach using the
stochastic partial differential equation (SPDE) link, geographically weighted
regression (GWR), and eigenvector spatial filtering (ESF) in both a simulation
study and an application where the goal is to predict prices of real estate
apartments in Switzerland. The results from both the simulation study and
application show that the MLE approach results in increased predictive accuracy
and more precise estimates. Since we use a model-based approach, we can also
provide predictive variances. In contrast to...
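
Covariance tapering can be sketched as follows. This is only an illustration under stated assumptions: the abstract names neither the covariance function nor the taper, so an exponential covariance and a Wendland-1 taper are assumed here, and all function and parameter names are hypothetical. The idea is that multiplying the covariance by a compactly supported taper sets it exactly to zero beyond the taper range, giving a sparse covariance matrix for large data.

```python
import math


def exponential_cov(h, variance, range_):
    """Assumed exponential covariance at distance h."""
    return variance * math.exp(-h / range_)


def wendland1_taper(h, taper_range):
    """Wendland-1 taper: (1 - x)^4 * (1 + 4x) for x = h / taper_range < 1, else 0."""
    x = h / taper_range
    if x >= 1.0:
        return 0.0
    return (1.0 - x) ** 4 * (1.0 + 4.0 * x)


def tapered_cov(h, variance, range_, taper_range):
    """Product of covariance and taper; exactly zero for h >= taper_range."""
    return exponential_cov(h, variance, range_) * wendland1_taper(h, taper_range)


for h in (0.0, 1.0, 2.0, 3.0, 5.0):
    print(h, tapered_cov(h, variance=2.0, range_=1.0, taper_range=3.0))
```

The taper leaves the variance at distance zero unchanged and shrinks the covariance smoothly toward zero as the distance approaches the taper range.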

StatsPapers:
Maximum Likelihood Estimation of Spatially Varying Coefficient Models for Large Data with an Application to Real Estate Price Prediction. https://t.co/M6P301Bzuh

Sample Sizes : None.

Authors: 3

Total Words: 0

Unique Words: 0

When statistical analyses consider multiple data sources, Markov melding
provides a method for combining the source-specific Bayesian models. Models
often contain different quantities of information due to variation in the
richness of model-specific data, or availability of model-specific prior
information. We show that this can make the multi-stage Markov chain Monte
Carlo sampler employed by Markov melding unstable and unreliable. We propose a
robust multi-stage algorithm that estimates the required prior marginal
self-density ratios using weighted samples, dramatically improving accuracy in
the tails of the distribution, thus stabilising the algorithm and providing
reliable inference. We demonstrate our approach using an evidence synthesis for
inferring HIV prevalence, and an evidence synthesis of A/H1N1 influenza.

StatsPapers:
A numerically stable algorithm for integrating Bayesian models using Markov melding. https://t.co/WEeo0BzJGt

Sample Sizes : None.

Authors: 2

Total Words: 0

Unique Words: 0

Deep learning methods are the gold standard for non-linear statistical
modeling in computer vision and in natural language processing but are rarely
used in psychometrics. To bridge this gap, we present a novel deep learning
algorithm for exploratory item factor analysis (IFA). Our approach combines a
deep artificial neural network (ANN) model called a variational autoencoder
(VAE) with recent work that uses regularization for exploratory factor
analysis. We first provide overviews of ANNs and VAEs. We then describe how to
conduct exploratory IFA with a VAE and demonstrate our approach in two
empirical examples and in two simulated examples. Our empirical results were
consistent with existing psychological theory across random starting values.
Our simulations suggest that the VAE consistently recovers the data generating
factor pattern with moderate-sized samples. Secondary loadings were
underestimated with a complex factor structure and intercept parameter
estimates were moderately biased with both simple and complex...

BrundageBot:
A Deep Learning Algorithm for High-Dimensional Exploratory Item Factor Analysis. Christopher J. Urban and Daniel J. Bauer https://t.co/wlesX9lrXX

arxiv_cs_LG:
A Deep Learning Algorithm for High-Dimensional Exploratory Item Factor Analysis. Christopher J. Urban and Daniel J. Bauer https://t.co/AWvFMsALQy

StatsPapers:
A Deep Learning Algorithm for High-Dimensional Exploratory Item Factor Analysis. https://t.co/B4tvtnYArZ

Sample Sizes : None.

Authors: 2

Total Words: 0

Unique Words: 0

We consider the problem of assessing the importance of multiple variables or
factors from a dataset when side information is available. In principle, using
side information can allow the statistician to pay attention to variables with
a greater potential, which in turn, may lead to more discoveries. We introduce
an adaptive knockoff filter, which generalizes the knockoff procedure (Barber
and Candès, 2015; Candès et al., 2018) in that it uses both the data at
hand and side information to adaptively order the variables under study and
focus on those that are most promising. Adaptive knockoffs controls the
finite-sample false discovery rate (FDR) and we demonstrate its power by
comparing it with other structured multiple testing methods. We also apply our
methodology to real genetic data in order to find associations between genetic
variants and various phenotypes such as Crohn's disease and lipid levels. Here,
adaptive knockoffs makes more discoveries than reported in previous studies on
the same datasets.
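
For background, the baseline selection rule that the adaptive filter generalizes can be sketched in a few lines. This is the standard knockoff+ threshold of Barber and Candès (2015), not the adaptive procedure from this paper, and it does not use side information. The statistics `W` below are made-up numbers; in practice they come from comparing each variable with its knockoff copy, with large positive values indicating evidence for the original variable.

```python
def knockoff_plus_threshold(W, q):
    """Smallest t among the nonzero |W_j| such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q.
    Returns infinity if no such t exists (nothing is selected).
    """
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        negatives = sum(1 for w in W if w <= -t)
        positives = sum(1 for w in W if w >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_plus_select(W, q):
    """Indices of variables whose statistic reaches the threshold."""
    t = knockoff_plus_threshold(W, q)
    return [j for j, w in enumerate(W) if w >= t]


# Hypothetical feature statistics for ten variables.
W = [5, 4, 3, 2.5, -1, 2, 1.5, -0.5, 3.5, 4.5]
print(knockoff_plus_threshold(W, q=0.2))
print(knockoff_plus_select(W, q=0.2))
```

The ratio inside the loop is the estimate of the false discovery proportion at threshold t: negative statistics stand in for false positives among the selected variables.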

StatsPapers:
Knockoffs with Side Information. https://t.co/HZ2OylEJ01

Sample Sizes : None.

Authors: 2

Total Words: 0

Unique Words: 0

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real-time) based on how verifiable they are (as determined by their Github repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

*Tracking 257,103 papers.*
