Multivariate circular observations, i.e. points on a torus, are nowadays very
common. Multivariate wrapped models are often appropriate to describe data
points scattered on a p-dimensional torus. However, statistical inference based
on these models is quite complicated, since each contribution to the log
likelihood involves an infinite sum over indices in Z^p, where p is the
dimension of the problem. To overcome this, two estimation procedures based on
the Expectation Maximization and Classification Expectation Maximization
algorithms are proposed that work well in moderate dimensions. The performance
of the introduced methods is studied by Monte Carlo simulation and illustrated
on three real data sets.

Authors: 4

Total Words: 6907

Unique Words: 1847
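
To make the computational difficulty in the entry above concrete, the following is a minimal sketch of a wrapped normal density on the p-torus, with the infinite sum over Z^p truncated to indices in {-K, ..., K}^p. The function names `wrapped_normal_pdf` and `n_terms` and the truncation level `K` are assumptions chosen for illustration; they are not taken from the paper, and this is not the EM or CEM procedure the authors propose.

```python
import itertools
import numpy as np

def wrapped_normal_pdf(theta, mu, sigma, K=2):
    """Truncated wrapped normal density at an angle vector theta (illustrative).

    The exact density sums the normal density at theta + 2*pi*k over all
    k in Z^p; here each index is restricted to -K..K (an assumption).
    """
    theta = np.asarray(theta, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    p = theta.size
    inv = np.linalg.inv(sigma)
    norm_const = 1.0 / np.sqrt((2 * np.pi) ** p * np.linalg.det(sigma))
    total = 0.0
    # Enumerate every index vector k in {-K, ..., K}^p.
    for k in itertools.product(range(-K, K + 1), repeat=p):
        d = theta + 2 * np.pi * np.array(k) - mu
        total += np.exp(-0.5 * d @ inv @ d)
    return norm_const * total

def n_terms(p, K=2):
    """Number of terms in the truncated sum for one observation."""
    return (2 * K + 1) ** p
```

Even with this crude truncation, the number of terms per observation grows as (2K + 1)^p, which is one way to see why direct likelihood evaluation becomes impractical as p increases.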

Very large spatio-temporal lattice data are becoming increasingly common
across a variety of disciplines. However, estimating interdependence across
space and time in large areal datasets remains challenging, as existing
approaches are often (i) not scalable, (ii) designed for conditionally Gaussian
outcome data, or (iii) limited to cross-sectional and univariate outcomes.
This paper proposes an MCEM estimation strategy for a family of latent-Gaussian
multivariate spatio-temporal models that addresses these issues. The proposed
estimator is applicable to a wide range of non-Gaussian outcomes, and
implementations for binary and count outcomes are discussed explicitly. The
methodology is illustrated on simulated data, as well as on weekly data of
IS-related events in Syrian districts.

Authors: 4

Total Words: 9948

Unique Words: 2831

We present a kernel-independent method that applies hierarchical matrices to
the problem of maximum likelihood estimation for Gaussian processes. The
proposed approximation provides natural and scalable stochastic estimators for
its gradient and Hessian, as well as the expected Fisher information matrix,
that are computable in quasilinear $O(n \log^2 n)$ complexity for a large range
of models. To accomplish this, we (i) choose a specific hierarchical
approximation for covariance matrices that enables the computation of their
exact derivatives and (ii) use a stabilized form of the Hutchinson stochastic
trace estimator. Since both the observed and expected information matrices can
be computed in quasilinear complexity, covariance matrices for MLEs can also be
estimated efficiently. After discussing the associated mathematics, we
demonstrate the scalability of the method, discuss details of its
implementation, and validate that the resulting MLEs and confidence intervals
based on the inverse Fisher information matrix faithfully...

Authors: 3

Total Words: 9003

Unique Words: 2406
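
The entry above relies on a stochastic trace estimator. As a rough illustration, here is a minimal sketch of the plain Hutchinson estimator with Rademacher probe vectors. It is not the stabilized form the abstract refers to, and it uses a generic matrix-vector product rather than a hierarchical matrix; the function name `hutchinson_trace` and its parameters are assumptions made for this example.

```python
import numpy as np

def hutchinson_trace(matvec, n, num_probes=200, seed=0):
    """Estimate trace(A) using only products v -> A @ v (illustrative).

    For probe vectors z with independent +/-1 entries, E[z^T A z] = trace(A),
    so averaging z^T A z over several probes gives an unbiased estimate.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)  # Rademacher probe vector
        total += z @ matvec(z)
    return total / num_probes
```

The estimator only needs matrix-vector products, which is why it pairs naturally with matrix approximations that make such products cheap. For a diagonal matrix the estimate is exact for any probe, since every squared entry of z equals one.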

This paper revisits the classic iterative proportional scaling (IPS) from a
modern optimization perspective. In contrast to the criticisms made in the
literature, we show that based on a coordinate descent characterization, IPS
can be slightly modified to deliver coefficient estimates, and from a
majorization-minimization standpoint, IPS can be extended to handle log-affine
models with features not necessarily binary-valued or nonnegative. Furthermore,
some state-of-the-art optimization techniques such as block-wise computation,
randomization and momentum-based acceleration can be employed to provide more
scalable IPS algorithms, as well as some regularized variants of IPS for
concurrent feature selection.

Authors: 2

Total Words: 11832

Unique Words: 3305
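
For readers unfamiliar with the classic procedure the entry above revisits, the following is a minimal sketch of textbook iterative proportional scaling on a two-way table: the table is rescaled alternately so that its row sums and column sums match target margins. It does not include the modifications, extensions, or accelerations proposed in the paper. The function name `ipf_two_way` and its arguments are assumptions for illustration, and the sketch assumes a strictly positive starting table and margins with equal totals.

```python
import numpy as np

def ipf_two_way(seed_table, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Classic IPS on a two-way table (illustrative).

    Alternately rescales rows and columns of a positive table until the
    row margins match row_targets (columns match after each column step).
    """
    table = np.asarray(seed_table, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(max_iter):
        # Row step: scale each row so its sum equals the row target.
        table *= (row_targets / table.sum(axis=1))[:, None]
        # Column step: scale each column so its sum equals the column target.
        table *= (col_targets / table.sum(axis=0))[None, :]
        if np.abs(table.sum(axis=1) - row_targets).max() < tol:
            break
    return table
```

Each step adjusts one set of margins while holding the other factors fixed, which is the sense in which the procedure can be read as a coordinate-wise scheme.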

In the package corr2D two-dimensional correlation analysis is implemented in
R. This paper describes how two-dimensional correlation analysis is done in the
package and how the mathematical equations are translated into R code. The
paper features a simple tutorial with executable code for beginners, insight
into the calculations done before the correlation analysis, a detailed look
at the parallelization of the fast Fourier transformation-based correlation
analysis and a speed test of the calculation. The package corr2D offers the
possibility to preprocess, correlate and postprocess spectroscopic data using
exclusively the R language. Thus, corr2D is a welcome addition to the toolbox
of spectroscopists and makes two-dimensional correlation analysis more
accessible and transparent.

Authors: 4

Total Words: 13641

Unique Words: 3189
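
As a rough illustration of the kind of calculation the entry above describes, here is a minimal sketch of generalized two-dimensional correlation analysis in its matrix form, using the Hilbert-Noda transformation matrix for the asynchronous spectrum. It is written in Python with NumPy rather than R, it does not use the corr2D API, and it does not reproduce the package's FFT-based or parallelized computation. The function name `two_d_correlation` and the assumed input layout (rows are perturbation steps, columns are spectral variables) are choices made for this example.

```python
import numpy as np

def two_d_correlation(spectra):
    """Synchronous and asynchronous 2D correlation spectra (illustrative).

    spectra: array of shape (m, n) with m perturbation steps (rows)
    and n spectral variables (columns).
    """
    y = np.asarray(spectra, dtype=float)
    m = y.shape[0]
    # Dynamic spectra: subtract the mean spectrum over the perturbation.
    dyn = y - y.mean(axis=0)
    sync = dyn.T @ dyn / (m - 1)
    # Hilbert-Noda matrix: 0 on the diagonal, 1 / (pi * (k - j)) elsewhere.
    j, k = np.indices((m, m))
    diff = k - j
    noda = np.zeros((m, m))
    mask = diff != 0
    noda[mask] = 1.0 / (np.pi * diff[mask])
    asyn = dyn.T @ noda @ dyn / (m - 1)
    return sync, asyn
```

The synchronous spectrum is symmetric and the asynchronous spectrum is antisymmetric, which gives a quick sanity check on any implementation.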

Previous work has demonstrated the feasibility and value of conducting
distributed regression analysis (DRA), a privacy-protecting analytic method
that performs multivariable-adjusted regression analysis with only
summary-level information from participating sites. To our knowledge, there are
no DRA applications in SAS, the statistical software used by several large
national distributed data networks (DDNs), including the Sentinel System and
PCORnet. SAS/IML is available to perform the required matrix computations for
DRA in the SAS system. However, not all data partners in these large DDNs have
access to SAS/IML, which is licensed separately. In this first article of a
two-paper series, we describe a DRA application developed for use in Base SAS
and SAS/STAT modules for linear and logistic DRA within horizontally
partitioned DDNs and its successful tests.

Authors: 7

Total Words: 13959

Unique Words: 3124
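
The idea of fitting a regression from summary-level information only, as in the entry above, can be sketched for the linear case: with horizontally partitioned data, each site shares the matrices X'X and X'y, and a coordinating center sums them and solves the normal equations. This is a Python illustration, not the SAS application described in the paper; the function names `site_summaries` and `combine_and_solve` are hypothetical, and the logistic case, which requires iteration, is not covered.

```python
import numpy as np

def site_summaries(X, y):
    """Summary-level quantities one site would share (illustrative)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    return X.T @ X, X.T @ y

def combine_and_solve(summaries):
    """Pool per-site summaries and solve the normal equations."""
    xtx = sum(s[0] for s in summaries)
    xty = sum(s[1] for s in summaries)
    return np.linalg.solve(xtx, xty)
```

Because X'X and X'y are sums over rows, adding the per-site summaries reproduces the pooled quantities exactly, so the coefficients match those of an ordinary least squares fit on the combined individual-level data without any site sharing its rows.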

Previous work has demonstrated the feasibility and value of conducting
distributed regression analysis (DRA), a privacy-protecting analytic method
that performs multivariable-adjusted regression analysis with only
summary-level information from participating sites. To our knowledge, there are
no DRA applications in SAS, the statistical software used by several large
national distributed data networks (DDNs), including the Sentinel System and
PCORnet. SAS/IML is available to perform the required matrix computations for
DRA in the SAS system. However, not all data partners in these large DDNs have
access to SAS/IML, which is licensed separately. In this second article of a
two-paper series, we describe a DRA application developed using Base SAS and
SAS/STAT modules for distributed Cox proportional hazards regression within
horizontally partitioned DDNs and its successful tests.

Authors: 7

Total Words: 8517

Unique Words: 2160

The intra-cluster correlation coefficient (ICC) plays an important role in
designing cluster randomized trials (CRTs). Optimal CRTs are often designed
assuming that the magnitude of the ICC is constant across clusters.
However, this assumption is rarely satisfied. In some applications, precise
information about the cluster-specific correlation is known in advance. In this
article, we propose an optimal design with non-constant ICC across
clusters. Also, in many situations, the cost of sampling an observation from
a particular cluster may differ from that of another cluster. An optimal
design for those scenarios is also obtained assuming unequal costs of sampling
from different clusters. The theoretical findings are supplemented by thorough
numerical examples.

Authors: 2

Total Words: 6934

Unique Words: 1913

In this paper we analyze the use of subjective logic as a framework for
performing approximate transformations over probability distribution functions.
As for any approximation, we evaluate subjective logic in terms of
computational efficiency and bias. However, while the computational cost may be
easily estimated, the bias of subjective logic operators has not yet been
investigated. In order to evaluate this bias, we propose an experimental
protocol that exploits Monte Carlo simulations and their properties to assess
the distance between the result produced by subjective logic operators and the
true result of the corresponding transformation over probability distributions.
This protocol allows a modeler to get an estimate of the degree of
approximation she must be ready to accept as a trade-off for the computational
efficiency and the interpretability of the subjective logic framework.
Concretely, we apply our method to the relevant case study of the subjective
logic operator for binomial multiplication and we study empirically...

Empirical Evaluation of the Approximation of Binomial Multiplication in Subjective Logic using Monte Carlo Simulations

Authors: 3

Total Words: 10657

Unique Words: 1953

The rapid development of modern technology facilitates the appearance of
numerous unprecedented complex data which do not satisfy the axioms of
Euclidean geometry, while most of the statistical hypothesis tests are
available in Euclidean or Hilbert spaces. To properly analyze the data of more
complicated structures, efforts have been made to solve the fundamental test
problems in more general spaces. In this paper, a publicly available R package
Ball is provided to implement Ball statistical test procedures for K-sample
distribution comparison and test of mutual independence in metric spaces, which
extend the test procedures for two sample distribution comparison and test of
independence. Tailor-made algorithms as well as engineering techniques are
employed in the Ball package to speed up computation to the best of our
ability. Two real data analyses and several numerical studies have been
performed and the results certify the power of the Ball package in analyzing
complex data, e.g., spherical data and symmetric positive...

Authors: 4

Total Words: 9591

Unique Words: 2672

