Combining big data and machine learning algorithms, the power of automatic
decision tools induces as much hope as fear. Recently enacted European
legislation (GDPR) and French laws attempt to regulate the use of these tools.
Leaving aside the well-identified problems of data confidentiality and
impediments to competition, we focus on the risks of discrimination, the
problems of transparency and the quality of algorithmic decisions. The detailed
perspective of the legal texts, faced with the complexity and opacity of the
learning algorithms, reveals the need for important technological disruptions
for the detection or reduction of the discrimination risk, and for addressing
the right to obtain an explanation of the automatic decision. Since the trust of
developers and, above all, of users (citizens, litigants, customers) is
essential, algorithms exploiting personal data must be deployed in a strict
ethical framework. In conclusion, to answer this need, we list some means of
control to be developed: institutional control,...

Sample Sizes : None.

Authors: 4

Total Words: 12229

Unique Words: 3833

To achieve scientific progress in terms of building a cumulative body of
knowledge, careful attention to benchmarking is of the utmost importance. This
means that proposals of new methods of data pre-processing, new data-analytic
techniques, and new methods of output post-processing, should be extensively
and carefully compared with existing alternatives, and that existing methods
should be subjected to neutral comparison studies. To date, benchmarking and
recommendations for benchmarking have been frequently seen in the context of
supervised learning. Unfortunately, there has been a dearth of guidelines for
benchmarking in an unsupervised setting, with the area of clustering as an
important subdomain. To address this problem, discussion is given to the
theoretical conceptual underpinnings of benchmarking in the field of cluster
analysis by means of simulated as well as empirical data. Subsequently, the
practicalities of how to address benchmarking questions in clustering are dealt
with, and foundational recommendations are made.

Sample Sizes : None.

Authors: 8

Total Words: 10895

Unique Words: 2753

Null hypothesis significance testing remains popular despite decades of
concern about misuse and misinterpretation. We believe that much of the problem
is due to language: significance testing has little to do with other meanings
of the word "significance". Despite the limitations of null-hypothesis tests,
we argue here that they remain useful in many contexts as a guide to whether a
certain effect can be seen clearly in that context (e.g. whether we can clearly
see that a correlation or between-group difference is positive or negative). We
therefore suggest that researchers describe the conclusions of null-hypothesis
tests in terms of statistical "clarity" rather than statistical "significance".
This simple semantic change could substantially enhance clarity in statistical
communication.
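As a rough illustration of the suggested wording, the sketch below runs a simple two-sided permutation test on a between-group difference and reports the outcome in terms of "clarity". The function names, the 0.05 threshold, and the choice of a permutation test are assumptions made for this example; the abstract does not prescribe any particular test or code.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Return (observed difference in means, two-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_permutations + 1)


def describe_clarity(observed, p_value, alpha=0.05):
    """Phrase a test result as statistically clear or unclear (alpha is an assumed threshold)."""
    if p_value < alpha:
        direction = "positive" if observed > 0 else "negative"
        return f"statistically clear {direction} difference (p = {p_value:.3f})"
    return f"sign of the difference is statistically unclear (p = {p_value:.3f})"


if __name__ == "__main__":
    # Hypothetical measurements, invented purely for illustration.
    treated = [5.1, 5.3, 5.0, 5.4, 5.2, 5.5]
    control = [4.1, 4.0, 4.3, 4.2, 3.9, 4.4]
    print(describe_clarity(*permutation_p_value(treated, control)))
```

The report states whether the sign of the effect can be seen clearly, without implying that the effect is large or important.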

I can see clearly now: reinterpreting statistical significance
suggestion: “researchers describe the conclusions of null-hypothesis tests in terms of statistical "clarity" rather than statistical "significance"”
Interesting paper! @jd_mathbio, @MPKain + @bolkerb argue for using "statistically clear" over "statistically significant" to describe the results of null hypothesis testing.
Sample Sizes : None.

Authors: 3

Total Words: 2529

Unique Words: 1132

We prove that intersections and unions of independent random sets in finite
spaces achieve a form of Lipschitz continuity. More precisely, given the
distribution of a random set $\Xi$, the function mapping any random set
distribution to the distribution of its intersection (under independence
assumption) with $\Xi$ is Lipschitz continuous with unit Lipschitz constant if
the space of random set distributions is endowed with a metric defined as the
$L_k$ norm distance between inclusion functionals also known as commonalities.
Moreover, the function mapping any random set distribution to the distribution
of its union (under independence assumption) with $\Xi$ is Lipschitz continuous
with unit Lipschitz constant if the space of random set distributions is
endowed with a metric defined as the $L_k$ norm distance between hitting
functionals also known as plausibilities.
Using the epistemic random set interpretation of belief functions, we also
discuss the ability of these distances to yield conflict measures. All the
proofs in this...
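
A brief sketch of why a unit Lipschitz constant is plausible, written under assumptions that are ours rather than the paper's: $X$ and $Y$ are random sets independent of $\Xi$, $q_X(A) = P(A \subseteq X)$ denotes the commonality, $pl_X(A) = P(X \cap A \neq \emptyset)$ the plausibility, and $\|f\|_k = (\sum_A |f(A)|^k)^{1/k}$ is taken over subsets $A$ of the finite space. This is not a reproduction of the paper's proofs.

```latex
\begin{align*}
% Intersection: A is included in X \cap \Xi iff it is included in both,
% so under independence the commonalities multiply.
q_{X \cap \Xi}(A) &= q_X(A)\, q_\Xi(A), \\
\| q_{X \cap \Xi} - q_{Y \cap \Xi} \|_k
  &= \| (q_X - q_Y)\, q_\Xi \|_k
  \le \| q_X - q_Y \|_k
  \quad \text{since } 0 \le q_\Xi(A) \le 1. \\[4pt]
% Union: X \cup \Xi misses A iff both X and \Xi miss A.
1 - pl_{X \cup \Xi}(A) &= \bigl(1 - pl_X(A)\bigr)\bigl(1 - pl_\Xi(A)\bigr), \\
\| pl_{X \cup \Xi} - pl_{Y \cup \Xi} \|_k
  &= \| (pl_X - pl_Y)(1 - pl_\Xi) \|_k
  \le \| pl_X - pl_Y \|_k
  \quad \text{since } 0 \le 1 - pl_\Xi(A) \le 1.
\end{align*}
```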

Sample Sizes : None.

Authors: 1

Total Words: 11062

Unique Words: 2306

Chen and Risen pointed out a logical flaw affecting the conclusions of a
number of past experiments that used the free-choice paradigm to measure
choice-induced attitude change. They went on to design and implement a
free-choice experiment that used a novel type of control group in order to
avoid this logical pitfall. In this paper, we describe a method by which a
free-choice experiment can be correctly conducted even without a control group.

Sample Sizes : None.

Authors: 2

Total Words: 7463

Unique Words: 2027

In the context of industrial engineering, a standby allocation strategy is
usually adopted by engineers to improve the lifetimes of coherent systems. This
paper investigates the optimal allocation strategies of standby redundancies
for coherent systems comprised of dependent components having left tail weakly
stochastic arrangement increasing or right tail weakly stochastic arrangement
increasing lifetimes. For the case of independent matched heterogeneous standby
redundancies, it is proved that the better redundancy should be put in the node
with the weaker [better] component in a series [parallel] system. For the case of
independent homogeneous standby redundancies, it is shown that more
redundancies should be put in standby with the weaker [better] component to
improve the lifetime of a series [parallel] system. The results developed here
generalize and extend related results in the literature to the case of
dependent components. Numerical examples are presented to provide guidance for
practical use of our theoretical findings....

Sample Sizes : None.

Authors: 1

Total Words: 9699

Unique Words: 1826

New tools have made it much easier for students to develop skills to work
with interesting data sets as they begin to extract meaning from data. To fully
appreciate the statistical analysis cycle, students benefit from repeated
experiences collecting, ingesting, wrangling, analyzing data and communicating
results. How can we bring such opportunities into the classroom? We describe a
classroom activity, originally developed by Danny Kaplan (Macalester College),
in which students can expand upon statistical problem solving by hand-scraping
data from cars.com, ingesting these data into R, then carrying out analyses of
the relationships between price, mileage, and model year for a selected type of
car.
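
The kind of analysis described above can be sketched as follows. The original activity uses R; Python is used here only for illustration. The listings, the `fit_line` helper, and the choice of a simple least-squares line of price on mileage are all assumptions made for this example and are not taken from the activity itself.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical hand-scraped listings: (price in dollars, mileage, model year).
    # These values are invented for illustration, not real cars.com data.
    listings = [
        (18500, 12000, 2017),
        (16900, 31000, 2016),
        (15200, 45000, 2015),
        (13800, 62000, 2014),
        (11900, 88000, 2012),
    ]
    prices = [row[0] for row in listings]
    mileages = [row[1] for row in listings]
    intercept, slope = fit_line(mileages, prices)
    print(f"Estimated price change per mile: {slope:.4f} dollars")
    print(f"Estimated price at zero mileage: {intercept:.0f} dollars")
```

Model year could be added as a second predictor with a multiple regression; that step is omitted here to keep the sketch dependency-free.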

Cars.com scraping and multivariate analysis CAUSE activity webinar

Stargazers: 0

Subscribers: 3


Forks: 0

Open Issues: 0


Sample Sizes : None.

Authors: 2

Total Words: 2514

Unique Words: 1142

Jon August Wellner was born in Portland, Oregon, in August 1945. He received
his Bachelor's degree from the University of Idaho in 1968 and his PhD degree
from the University of Washington in 1975. From 1975 until 1983 he was an
Assistant Professor and Associate Professor at the University of Rochester. In
1983 he returned to the University of Washington, and has remained at the UW as
a faculty member since that time. Over the course of a long and distinguished
career, Jon has made seminal contributions to a variety of areas including
empirical processes, semiparametric theory, and shape-constrained inference,
and has co-authored a number of extremely influential books. He has been
honored as the Le Cam lecturer by both the IMS (2015) and the French
Statistical Society (2017). He is a Fellow of the IMS, the ASA, and the AAAS,
and an elected member of the International Statistical Institute. He has served
as co-Editor of Annals of Statistics (2001--2003) and Editor of Statistical
and President of IMS

Sample Sizes : None.

Authors: 2

Total Words: 10575

Unique Words: 3009

We describe a contest in variable selection which was part of a statistics
course for graduate students. In particular, the opportunity to create a
contest themselves offered an additional challenge for more advanced students.
Since working with data is becoming more important in teaching statistics, we
greatly encourage other instructors to try the same.

Sample Sizes : None.

Authors: 3

Total Words: 3264

Unique Words: 1302

This article, produced as a result of the Symposium on Statistical Inference,
is an introduction to the literature on the function of expertise, judgment,
and choice in the practice of statistics and scientific research. In
particular, expert judgment plays a critical role in conducting Frequentist
hypothesis tests and Bayesian models, especially in selection of appropriate
prior distributions for model parameters. The subtlety of interpreting results
is also discussed. Finally, external recommendations are collected for how to
more effectively encourage proper use of judgment in statistics. The paper
synthesizes the literature for the purpose of creating a single reference and
inciting more productive discussions on how to improve the future of statistics
and science.

Sample Sizes : None.

Authors: 1

Total Words: 12020

Unique Words: 3845

