New methods are proposed for adjusting probabilistic forecasts to ensure
coherence with the aggregation constraints inherent in temporal hierarchies.
The different approaches nested within this framework include methods that
exploit information at all levels of the hierarchy as well as a novel method
based on cross-validation. The methods are evaluated using real data from two
wind farms in Crete, an application where it is imperative for optimal
decisions related to grid operations and bidding strategies to be based on
coherent probabilistic forecasts of wind power. Empirical evidence is also
presented showing that probabilistic forecast reconciliation improves the
accuracy of both point forecasts and probabilistic forecasts.
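
To make the notion of coherence concrete, the following minimal sketch reconciles point forecasts in a two-level temporal hierarchy (one aggregate period and its sub-periods) by orthogonal (OLS) projection onto the set of coherent forecasts. It is a standard point-forecast illustration and not the probabilistic methods proposed in the paper; the function name `reconcile_two_level` and all numbers are hypothetical.

```python
def reconcile_two_level(total_hat, bottom_hat):
    """OLS-reconcile an aggregate forecast with its sub-period forecasts.

    Minimises the squared distance to the base forecasts subject to
    total == sum(bottom). For one aggregate and m sub-periods the
    solution spreads the incoherence equally over all m + 1 series.
    """
    m = len(bottom_hat)
    gap = total_hat - sum(bottom_hat)   # incoherence of the base forecasts
    adjustment = gap / (m + 1)
    bottom = [b + adjustment for b in bottom_hat]
    total = total_hat - adjustment
    return total, bottom


# Hypothetical base forecasts: an annual total and four quarters
# that do not add up (the quarters sum to 90, the annual forecast is 100).
total, quarters = reconcile_two_level(100.0, [20.0, 25.0, 30.0, 15.0])
print(total, quarters, sum(quarters))
```

After reconciliation the annual forecast equals the sum of the quarterly forecasts, which is the aggregation constraint the abstract refers to.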

Authors: 3

Total Words: 10643

Unique Words: 2571

This paper provides estimation and inference methods for a structural
function, such as Conditional Average Treatment Effect (CATE), based on modern
machine learning (ML) tools. We assume that such a function can be represented as
an expectation g(x) of a signal Y conditional on X that depends on an unknown
nuisance function. In addition to CATE, examples of such functions include
regression function with Partially Missing Outcome and Conditional Average
Partial Derivative. We approximate g(x) by a linear form that is a product of a
vector of the approximating basis functions p(x) and the Best Linear Predictor
(BLP), which we refer to as a pseudo-target. Plugging in the first-stage estimate
of the nuisance function into the signal, we estimate BLP via ordinary least
squares. We deliver a high-quality estimate of the pseudo-target function that
features (a) a pointwise Gaussian approximation, (b) a simultaneous Gaussian
approximation, and (c) optimal rate of simultaneous convergence. In the case,
the misspecification error of the linear...
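
The second-stage idea, regressing a signal on basis functions by ordinary least squares, can be sketched as follows. This is a simplified illustration under loud assumptions that are not taken from the paper: a simulated randomized experiment with a known propensity of 0.5 (so no first-stage ML estimate is needed), the inverse-propensity "transformed outcome" as the signal, and the two-term basis p(x) = (1, x). All names are hypothetical.

```python
import random


def ols_linear_basis(xs, signal):
    """OLS of the signal on the basis p(x) = (1, x); returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    s_bar = sum(signal) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxs = sum((x - x_bar) * (s - s_bar) for x, s in zip(xs, signal))
    slope = sxs / sxx
    return s_bar - slope * x_bar, slope


rng = random.Random(0)
n = 5000
propensity = 0.5  # assumed known, as in a randomized experiment

xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
ds = [1 if rng.random() < propensity else 0 for _ in range(n)]
# Simulated outcomes whose true CATE is 1 + 2x (an assumption of this toy setup).
ys = [x + d * (1.0 + 2.0 * x) + rng.gauss(0.0, 0.1) for x, d in zip(xs, ds)]

# Transformed-outcome signal: its conditional mean given X equals the CATE.
signal = [y * (d - propensity) / (propensity * (1.0 - propensity))
          for y, d in zip(ys, ds)]

intercept, slope = ols_linear_basis(xs, signal)
residuals = [s - (intercept + slope * x) for x, s in zip(xs, signal)]
print(intercept, slope)
```

The fitted line `intercept + slope * x` plays the role of the pseudo-target here; with a richer basis the same least-squares step would be applied to a longer vector p(x).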

Authors: 2

Total Words: 15984

Unique Words: 3104

We investigate the low-dimensional structure of deterministic transformations
between random variables, i.e., transport maps between probability measures. In
the context of statistics and machine learning, these transformations can be
used to couple a tractable "reference" measure (e.g., a standard Gaussian) with
a target measure of interest. Direct simulation from the desired measure can
then be achieved by pushing forward reference samples through the map. Yet
characterizing such a map---e.g., representing and evaluating it---grows
challenging in high dimensions. The central contribution of this paper is to
establish a link between the Markov properties of the target measure and the
existence of low-dimensional couplings, induced by transport maps that are
sparse and/or decomposable. Our analysis not only facilitates the construction
of transformations in high-dimensional settings, but also suggests new
inference methodologies for continuous non-Gaussian graphical models. For
instance, in the context of nonlinear state-space...
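
As a minimal sketch of pushing reference samples through a transport map, the example below uses the simplest case: a lower-triangular linear map that couples a standard Gaussian reference with a bivariate Gaussian target. The target mean and covariance are hypothetical, and the linear-Gaussian setting is an assumption made for illustration; the paper concerns general (non-Gaussian) targets.

```python
import math
import random


def cholesky_2x2(s11, s12, s22):
    """Lower-triangular factor L of a 2x2 covariance matrix, so that L L^T = Sigma."""
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 ** 2)
    return l11, l21, l22


def push_forward(z, mean, chol):
    """Apply the triangular map T(z) = mean + L z to one reference sample z."""
    l11, l21, l22 = chol
    z1, z2 = z
    return (mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2)


rng = random.Random(0)
mean = (1.0, -2.0)                   # hypothetical target mean
chol = cholesky_2x2(4.0, 1.2, 1.0)   # hypothetical target covariance entries

# Draw from the reference (standard Gaussian) and push each sample forward.
samples = [push_forward((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)), mean, chol)
           for _ in range(1000)]

# If the two target components are independent, the off-diagonal term
# of the map vanishes, so the map becomes sparse (diagonal).
sparse_chol = cholesky_2x2(4.0, 0.0, 1.0)
```

In this toy case the second output depends on the first reference coordinate only through `l21`, which is zero exactly when the two target components are uncorrelated.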

Authors: 3

Total Words: 33261

Unique Words: 5530

In this work, we propose JSDMs where the responses to environmental
covariates are modeled with multivariate additive Gaussian processes. These
allow inference for a wide range of functional forms and interspecific
correlations between the responses. We also propose an efficient approach for
inference by utilizing the Laplace approximation with a parameterization of the
interspecific covariance matrices on the Euclidean space. We demonstrate the
benefits of our model with two small-scale examples and one real-world case
study. We use cross-validation to compare the proposed model to analogous
single species models in interpolation and extrapolation tasks. The proposed
model outperforms the single species models in both cases. We also show that
the proposed model can be seen as an extension of the current state-of-the-art
JSDMs to semiparametric models.

Authors: 3

Total Words: 16757

Unique Words: 4041

We propose a prior distribution for the number of components of a finite
mixture model. The novelty is that the prior distribution is obtained by
considering the loss one would incur if the true value representing the number
of components were not considered. The prior has an elegant and easy-to-implement
structure, which allows one to naturally include any prior information
one may have as well as to opt for a default solution in cases where this
information is not available. The performance of the prior, and comparison with
existing alternatives, is studied through the analysis of both real and
simulated data.

Authors: 3

Total Words: 4765

Unique Words: 1566

The Wallenius distribution is a generalisation of the Hypergeometric
distribution where weights are assigned to balls of different colours. This
naturally defines a model for ranking categories which can be used for
classification purposes. Since, in general, the resulting likelihood is not
analytically available, we adopt an approximate Bayesian computational (ABC)
approach for estimating the importance of the categories. We illustrate the
performance of the estimation procedure on simulated datasets. Finally, we use
the new model for analysing two datasets about movie ratings and Italian
academic statisticians' journal preferences. The latter is a novel dataset
collected by the authors.
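
An ABC approach needs only the ability to simulate from the model. A minimal sketch of such a forward simulator for the Wallenius distribution is given below: balls are drawn one at a time without replacement, each remaining ball being picked with probability proportional to the weight of its colour. The function name, counts, and weights are hypothetical, and the ABC step itself (comparing simulated and observed summaries) is not shown.

```python
import random


def sample_wallenius(counts, weights, n_draws, rng):
    """Simulate one draw from a multivariate Wallenius distribution.

    counts[i]  : number of balls of colour i in the urn
    weights[i] : positive weight (importance) of colour i
    Returns the number of balls of each colour among the n_draws drawn.
    """
    if n_draws > sum(counts):
        raise ValueError("cannot draw more balls than the urn contains")
    remaining = list(counts)
    drawn = [0] * len(counts)
    for _ in range(n_draws):
        # Each colour is chosen proportionally to (balls left) * (colour weight).
        masses = [r * w for r, w in zip(remaining, weights)]
        colour = rng.choices(range(len(counts)), weights=masses)[0]
        remaining[colour] -= 1
        drawn[colour] += 1
    return drawn


rng = random.Random(1)
urn = [5, 3, 2]
drawn = sample_wallenius(urn, [1.0, 2.0, 4.0], 6, rng)
everything = sample_wallenius(urn, [1.0, 2.0, 4.0], 10, rng)
print(drawn)
```

Colours with larger weights tend to be exhausted earlier, which is what lets the weights act as a ranking of the categories.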

Authors: 3

Total Words: 7703

Unique Words: 2435

Observed multidimensional network data can have different levels of
complexity, as nodes may be characterized by heterogeneous individual-specific
features. Also, such characteristics may vary across the networks. This article
discusses a novel class of models for multidimensional networks, able to deal
with different levels of heterogeneity within and between networks. The
proposed framework is developed within the family of latent space models, in
order to distinguish recurrent symmetrical relations between the nodes from
node-specific features in the different views. Model parameters are estimated
via a Markov Chain Monte Carlo algorithm. Simulated data and also FAO fruits
import/export data are analysed to illustrate the performance of the proposed
models.

Authors: 3

Total Words: 13715

Unique Words: 2522

We propose a general new method, the \emph{conditional permutation test}, for
testing the conditional independence of variables $X$ and $Y$ given a
potentially high-dimensional random vector $Z$ that may contain confounding
factors. The proposed test permutes entries of $X$ non-uniformly, so as to
respect the existing dependence between $X$ and $Z$ and thus account for the
presence of these confounders. Like the conditional randomization test of
\citet{candes2018panning}, our test relies on the availability of an
approximation to the distribution of $X \mid Z$---while
\citet{candes2018panning}'s test uses this estimate to draw new $X$ values, for
our test we use this approximation to design an appropriate non-uniform
distribution on permutations of the $X$ values already seen in the true data.
We provide an efficient Markov Chain Monte Carlo sampler for the implementation
of our method, and establish bounds on the Type~I error in terms of the error
in the approximation of the conditional distribution of $X\mid Z$, finding
that,...

Authors: 4

Total Words: 10934

Unique Words: 2331

Method comparison studies are essential for development in medical and
clinical fields. These studies often compare a cheaper, faster, or less
invasive measuring method with a widely used one to see if they have sufficient
agreement for interchangeable use. In the clinical and medical context, the
response measurement is usually impacted not only by the measuring method but
by the rater as well. This paper proposes a model-based approach to assess
agreement of two measuring methods for paired repeated binary measurements
for settings in which the agreement between the two measuring methods and the
agreement among raters are required to be studied in a unified framework. Based
upon the generalized linear mixed models (GLMM), the decision on the adequacy
of interchangeable use is made by testing the equality of fixed effects of
methods. Approaches for assessing method agreement, such as the Bland-Altman
diagram and Cohen's kappa, are also developed for repeated binary measurements
based upon the latent variables in GLMMs. We assess...

Authors: 4

Total Words: 7250

Unique Words: 1853

We propose an algorithm that is capable of imposing shape constraints on
regression curves without requiring the constraints to be written as
closed-form expressions or assuming a functional form for the loss function.
Our algorithm, which is based on Sequential Monte Carlo-Simulated Annealing,
only relies on an indicator function that assesses whether or not the
constraints are fulfilled, thus allowing us to enforce various complex
constraints by specifying an appropriate indicator function without altering
other parts of the algorithm. We demonstrate our algorithm by fitting rational
function models subject to monotonicity and continuity constraints. The
algorithm was implemented using R (R Core Team, 2018) and the code is freely
available on GitHub.
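
The kind of indicator function the abstract describes can be sketched as below for a simple rational function f(x) = (a0 + a1 x) / (1 + b1 x). This is an illustration only: the authors' implementation is in R and is not reproduced here, the function names and the model form are hypothetical, and the check is approximate because it inspects a finite grid rather than the whole interval.

```python
def rational(x, a0, a1, b1):
    """Hypothetical rational model f(x) = (a0 + a1 x) / (1 + b1 x)."""
    return (a0 + a1 * x) / (1.0 + b1 * x)


def satisfies_constraints(a0, a1, b1, lo=0.0, hi=1.0, n_grid=101):
    """Return True if f looks continuous and non-decreasing on [lo, hi].

    Continuity is checked by requiring the denominator to keep one strict
    sign on the grid (no pole); monotonicity by comparing neighbouring values.
    """
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    denominators = [1.0 + b1 * x for x in grid]
    same_sign = all(d > 0 for d in denominators) or all(d < 0 for d in denominators)
    if not same_sign:
        return False
    values = [rational(x, a0, a1, b1) for x in grid]
    return all(v2 >= v1 for v1, v2 in zip(values, values[1:]))


print(satisfies_constraints(0.0, 1.0, 1.0))    # x / (1 + x) on [0, 1]
print(satisfies_constraints(0.0, 1.0, -2.0))   # pole at x = 0.5
```

A sampler that only ever calls `satisfies_constraints` can swap in a different constraint by changing this one function, which is the design point made in the abstract.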

Authors: 3

Total Words: 5846

Unique Words: 1872

