In this paper, we predict the severity of extreme weather events (tropical
storms, hurricanes, etc.) from buoy time series variables such as wind
speed and air temperature. The method combines forecasting and machine
learning models and proceeds in the following steps. First, data sources for
the buoys and weather events are identified, aggregated and merged. Second,
missing data are imputed using Kalman filters as well as splines for
multivariate time series. Third, statistical tests are run to ascertain
increasing trends in weather event severity. Finally, we use machine learning
to predict event severity from the buoy variables, and report good
accuracies for the models built.
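
As a rough illustration of the Kalman-filter imputation step, the sketch below fills gaps in a single buoy variable with a local-level (random-walk) filter. It is not the paper's implementation: the function name `kalman_impute`, the univariate model, and the default variances are assumptions made for this example, and the paper works with multivariate series.

```python
from typing import List, Optional


def kalman_impute(series: List[Optional[float]],
                  process_var: float = 1.0,
                  obs_var: float = 1.0) -> List[float]:
    """Fill gaps (None) in a series with a local-level Kalman filter.

    Assumed model (hypothetical, for illustration only):
      state:       x_t = x_{t-1} + w_t,  w_t ~ N(0, process_var)
      observation: y_t = x_t + v_t,      v_t ~ N(0, obs_var)
    """
    observed = [y for y in series if y is not None]
    if not observed:
        raise ValueError("series has no observed values")

    mean = observed[0]  # initial state estimate
    var = 1e6           # large initial variance: little prior knowledge
    filled: List[float] = []
    for y in series:
        # Predict: the random-walk mean carries over, uncertainty grows.
        var += process_var
        if y is None:
            # Missing reading: skip the update and use the prediction.
            filled.append(mean)
            continue
        # Update: blend prediction and observation using the Kalman gain.
        gain = var / (var + obs_var)
        mean += gain * (y - mean)
        var *= 1.0 - gain
        filled.append(y)  # observed values are kept unchanged
    return filled


wind_speed = [10.0, None, 12.0, None, None, 15.0]
print(kalman_impute(wind_speed))
```

A forward filter like this uses only past readings to fill a gap; a smoother, or the spline alternative mentioned in the abstract, would also draw on the readings that follow the gap.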


arxivml:
"Weather event severity prediction using buoy data and machine learning",
Vikas Ramachandra
https://t.co/tY6nN2VUtS

StatsPapers:
Weather event severity prediction using buoy data and machine learning. https://t.co/XTekMnlHgR


Authors: 1

Total Words: 0

Unique Words: 0

Observing system uncertainty experiments (OSUEs) have been widely used as
a cost-effective way to assess retrieval quality in NASA's Orbiting
Carbon Observatory-2 (OCO-2) mission. One important component in the OCO-2
retrieval algorithm is a full-physics forward model that describes the
relationship between the atmospheric variables such as carbon dioxide and
radiances measured by the remote sensing instrument. This forward model is
complicated and computationally expensive but a large-scale OSUE requires
evaluation of this model numerous times, which makes it infeasible for
operational usage. To tackle this issue, we develop a statistical emulator to
facilitate efficient large-scale OSUEs in remote sensing. This emulator
represents radiances output at irregular wavelengths via a linear combination
of basis functions and random coefficients. These random coefficients are then
modeled with a nearest-neighbor Gaussian process with built-in input dimension
reduction via active subspace. The proposed emulator reduces...


StatsPapers:
Computer Model Emulation with High-Dimensional Functional Output in Large-Scale Observing System Uncertainty Experiments. https://t.co/SvF1NI9P4p


Authors: 6

Total Words: 0

Unique Words: 0

Although recent research on social networks emphasizes microscopic dynamics
such as retweets and social connectivity of an individual user, we focus on
macroscopic growth dynamics of social network link formation. Rather than
focusing on one particular dataset, we find invariant behavior in regional
social networks that are geographically concentrated. Empirical findings
suggest that the startup phase of a regional network can be modeled by a
self-exciting point process. After the startup phase ends, the growth of the
links can be modeled by a non-homogeneous Poisson process with constant rate
across the day but varying rates from day to day, plus a nightly inactive
period when local users are expected to be asleep. Conclusions are drawn based
on analyzing four different datasets: three regional ones and one non-regional
one included for contrast.
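
The post-startup growth model described above can be sketched as a simulation: a Poisson process whose rate is constant within a day, changes from day to day, and drops to zero during a nightly window. This is only an illustrative sketch. The function name `simulate_link_times`, the night window of 00:00 to 06:00, and the example rates are assumptions, not values from the paper.

```python
import random
from typing import List


def simulate_link_times(daily_rates: List[float],
                        night_start: float = 0.0,
                        night_end: float = 6.0,
                        seed: int = 0) -> List[float]:
    """Simulate link-formation times in hours since the start of day 0.

    daily_rates[d] is the constant rate (events per hour) during the
    active hours of day d. The rate is zero in [night_start, night_end),
    a hypothetical nightly inactive window.
    """
    rng = random.Random(seed)
    times: List[float] = []
    for day, rate in enumerate(daily_rates):
        if rate <= 0.0:
            continue  # no activity on this day
        t = 0.0
        while True:
            # Candidate events from a homogeneous process at this day's rate.
            t += rng.expovariate(rate)
            if t >= 24.0:
                break
            # Thinning: discard candidates that fall in the inactive window.
            if night_start <= t < night_end:
                continue
            times.append(24.0 * day + t)
    return times


events = simulate_link_times([2.0, 0.5, 4.0], seed=42)
print(len(events), "simulated link formations over 3 days")
```

Generating candidates at the day's rate and discarding those in the inactive window is a simple case of thinning, where the acceptance probability is either 0 or 1.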


arxiv_org:
Common Growth Patterns for Regional Social Networks: a Point Process Approach. https://t.co/A6hedJjulj https://t.co/qzirowQSgt

StatsPapers:
Common Growth Patterns for Regional Social Networks: a Point Process Approach. https://t.co/azD3NyfCru


Authors: 2

Total Words: 0

Unique Words: 0

Three important issues are often encountered in Supervised and
Semi-Supervised Classification: class-memberships are unreliable for some
training units (label noise), a proportion of observations might depart from
the main structure of the data (outliers) and new groups in the test set may
not have been encountered earlier in the learning phase (unobserved classes).
The present work introduces a robust and adaptive Discriminant Analysis rule,
capable of handling situations in which one or more of the aforementioned
problems occur. Two EM-based classifiers are proposed: the first jointly
exploits the training and test sets (transductive approach), and the second
expands the parameter estimate using the test set, to complete
the group structure learned from the training set (inductive approach).
Experiments on synthetic and real data, artificially adulterated, are provided
to underline the benefits of the proposed method.


StatsPapers:
Anomaly and Novelty detection for robust semi-supervised learning. https://t.co/Z8pQ1dcELn

Model-based framework for robust classification that jointly accounts for outliers, label noise and unobserved classes in the test set.

Stargazers: 2

Subscribers: 0

Forks: 0

Open Issues: 0


Authors: 3

Total Words: 14719

Unique Words: 3081

Advertising experiments often suffer from noisy responses, which make it
difficult to estimate the average treatment effect (ATE) precisely and to
evaluate ROI. We develop a principal stratification model that improves the
precision of the ATE by dividing the customers into three strata: those who buy
regardless of ad exposure, those who buy only if exposed to ads, and those who
do not buy regardless. The method decreases the variance of the ATE by separating out the
typically large share of customers who never buy and therefore have individual
treatment effects that are exactly zero. Applying the procedure to 5 catalog
mailing experiments with sample sizes around 140,000 shows a reduction of
36-57% in the variance of the estimate. When we include pre-randomization
covariates that predict stratum membership, we find that estimates of
customers' past response to similar advertising are a good predictor of stratum
membership, even if such estimates are biased because past advertising was
targeted. Customers who have not purchased recently...


StatsPapers:
Principal Stratification for Advertising Experiments. https://t.co/4cqYun5QLF


Authors: 2

Total Words: 11461

Unique Words: 3029

Although academic research on the 'hot hand' effect (in particular, in
sports, especially in basketball) has been going on for more than 30 years, it
still remains a central question in different areas of research whether such an
effect exists. In this contribution, we investigate the potential occurrence of
a 'hot shoe' effect for the performance of penalty takers in football based on
data from the German Bundesliga. For this purpose, we consider hidden Markov
models (HMMs) to model the (latent) forms of players. To further account for
individual heterogeneity of the penalty taker as well as the opponent's
goalkeeper, player-specific abilities are incorporated in the model formulation
together with a LASSO penalty. Our results suggest states which can be tied to
different forms of players, thus providing evidence for the hot shoe effect,
and shed some light on exceptionally well-performing goalkeepers, which are of
potential interest to managers and sports fans.
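
To make the hidden Markov model idea concrete, the sketch below evaluates the log-likelihood of a sequence of penalty outcomes (1 = scored, 0 = missed) under a plain HMM using the scaled forward algorithm. It is a simplified illustration only: the player-specific and goalkeeper-specific effects and the LASSO penalty from the abstract are left out, and the two-state setup and all parameter values are hypothetical.

```python
import math
from typing import Sequence


def hmm_log_likelihood(outcomes: Sequence[int],
                       init: Sequence[float],
                       trans: Sequence[Sequence[float]],
                       p_score: Sequence[float]) -> float:
    """Log-likelihood of binary outcomes under an HMM (forward algorithm).

    init[s]     : probability of starting in latent form s
    trans[r][s] : probability of moving from form r to form s
    p_score[s]  : probability of scoring a penalty while in form s
    """
    if not outcomes:
        return 0.0
    n_states = len(init)

    def emit(state: int, y: int) -> float:
        # Bernoulli emission: scored (1) or missed (0).
        return p_score[state] if y == 1 else 1.0 - p_score[state]

    alpha = [init[s] * emit(s, outcomes[0]) for s in range(n_states)]
    log_lik = 0.0
    for t, y in enumerate(outcomes):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit(s, y)
                for s in range(n_states)
            ]
        # Rescale at every step to avoid numerical underflow.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical two-form example: state 0 = "cold", state 1 = "hot".
ll = hmm_log_likelihood(
    outcomes=[1, 1, 0, 1, 1, 1],
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    p_score=[0.6, 0.9],
)
print(ll)
```

In a fitted model, persistent states with clearly different scoring probabilities would be the kind of pattern the abstract interprets as evidence for different forms of a player.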


StatsPapers:
A regularized hidden Markov model for analyzing the 'hot shoe' in football. https://t.co/DDgMfrNrhH


Authors: 2

Total Words: 6020

Unique Words: 1760

Ride-sourcing or transportation network companies (TNCs) provide on-demand
transportation service for compensation, connecting drivers of personal
vehicles with passengers through the use of smartphone applications. This
article considers the problem of estimating the probability distribution of the
productivity of a driver as a function of space and time. We study data
consisting of more than 1 million ride-sourcing trips in Austin, Texas, which
are scattered throughout a large graph of 223k vertices, where each vertex
represents a traffic analysis zone (TAZ) at a specific hour of the week. We
extend existing methods for spatial density smoothing on very large general
graphs to the spatiotemporal setting. Our proposed model allows for distinct
spatial and temporal dynamics, including different degrees of smoothness, and
it appropriately handles vertices with missing data, which in our case arise
from a fine discretization over the time dimension. Core to our method is an
extension of the Graph-Fused Lasso that we refer to as the...
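
The paper's title calls the method the Graph-fused Elastic Net. As a hedged sketch of what such a penalty could look like, the code below scores a candidate signal on a graph with an L1 plus squared-L2 penalty on differences across edges, using separate weights for spatial and temporal edges and a fit term computed only at vertices that have data. The squared-error fit term, the function names, and the weights are assumptions for illustration; the paper's actual objective may differ.

```python
from typing import Dict, Iterable, Sequence, Tuple

Edge = Tuple[int, int]


def fused_penalty(beta: Sequence[float], edges: Iterable[Edge],
                  l1_weight: float, l2_weight: float) -> float:
    """Elastic-net style penalty on differences across graph edges."""
    total = 0.0
    for i, j in edges:
        diff = beta[i] - beta[j]
        total += l1_weight * abs(diff) + l2_weight * diff * diff
    return total


def objective(beta: Sequence[float],
              observed: Dict[int, float],
              spatial_edges: Iterable[Edge],
              temporal_edges: Iterable[Edge],
              spatial_weights: Tuple[float, float] = (1.0, 1.0),
              temporal_weights: Tuple[float, float] = (1.0, 1.0)) -> float:
    """Fit term over observed vertices plus spatial and temporal penalties.

    Vertices absent from `observed` (missing data) contribute no fit term,
    so their values are determined by their neighbours through the penalty.
    """
    fit = sum((beta[v] - y) ** 2 for v, y in observed.items())
    return (fit
            + fused_penalty(beta, spatial_edges, *spatial_weights)
            + fused_penalty(beta, temporal_edges, *temporal_weights))


# Three vertices; vertex 1 has no data and is tied to its neighbours.
value = objective(
    beta=[1.0, 1.0, 3.0],
    observed={0: 1.0, 2: 2.0},
    spatial_edges=[(0, 1)],
    temporal_edges=[(1, 2)],
    spatial_weights=(1.0, 1.0),
    temporal_weights=(0.5, 0.25),
)
print(value)
```

Keeping separate weight pairs for the two edge types is one way to express the different degrees of spatial and temporal smoothness mentioned in the abstract.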


StatsPapers:
Large-Scale Spatiotemporal Density Smoothing with the Graph-fused Elastic Net: Application to Ride-sourcing Driver Productivity Analysis. https://t.co/GmfkAAglMW


Authors: 4

Total Words: 0

Unique Words: 0

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real time) based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

*Tracking 226,496 papers.*
