Decision trees are ubiquitous in machine learning for their ease of use and
interpretability; however, they are not typically implemented in reinforcement
learning because they cannot be updated via stochastic gradient descent.
Traditional applications of decision trees for reinforcement learning have
focused instead on making commitments to decision boundaries as the tree is
grown one layer at a time. We overcome this critical limitation by allowing for
a gradient update over the entire tree structure that improves sample
complexity when a tree is fuzzy and interpretability when sharp. We offer three
key contributions towards this goal. First, we motivate the need for policy
gradient-based learning by examining the theoretical properties of gradient
descent over differentiable decision trees. Second, we introduce a
regularization framework that yields interpretability via sparsity in the tree
structure. Third, we demonstrate the ability to construct a decision tree via
policy gradient in canonical reinforcement learning domains...
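As an illustration of the fuzzy-versus-sharp distinction in the abstract above, here is a minimal Python sketch of a single differentiable decision node. It is not the authors' implementation: the sigmoid gate, the `sharpness` parameter, and all function and variable names are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_node(x, weights, bias, left_leaf, right_leaf, sharpness=1.0):
    """Blend two leaf values with a sigmoid gate (hypothetical soft split).

    The gate is smooth in weights and bias, so the output is differentiable
    with respect to every parameter of the node.
    """
    z = sum(w * xi for w, xi in zip(weights, x)) - bias
    p_left = sigmoid(sharpness * z)
    return p_left * left_leaf + (1.0 - p_left) * right_leaf


# Same input and split, evaluated with a fuzzy and a nearly sharp gate.
fuzzy = soft_node([0.6], [1.0], 0.5, left_leaf=1.0, right_leaf=0.0, sharpness=1.0)
sharp = soft_node([0.6], [1.0], 0.5, left_leaf=1.0, right_leaf=0.0, sharpness=100.0)
```

With a small `sharpness` the node mixes both leaves, so every parameter receives a gradient signal; as `sharpness` grows, the gate approaches a hard threshold test that can be read like an ordinary decision rule.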

arxivml:
"Interpretable Reinforcement Learning via Differentiable Decision Trees",
Ivan Dario Jimenez Rodriguez, Taylor Kill…
https://t.co/eJ37FiRnpc

arxiv_cs_LG:
Interpretable Reinforcement Learning via Differentiable Decision Trees. Ivan Dario Jimenez Rodriguez, Taylor Killian, Sung-Hyun Son, and Matthew Gombolay https://t.co/jWqILy64vr

StatsPapers:
Interpretable Reinforcement Learning via Differentiable Decision Trees. https://t.co/iUsh6JQHtL

EricSchles:
RT @Miles_Brundage: "Interpretable Reinforcement Learning via Differentiable Decision Trees," Rodriguez et al.: https://t.co/56JT2hYpUj

tiagoooliveira:
RT @StatsPapers: Interpretable Reinforcement Learning via Differentiable Decision Trees. https://t.co/iUsh6JQHtL

KloudStrife:
RT @Miles_Brundage: "Interpretable Reinforcement Learning via Differentiable Decision Trees," Rodriguez et al.: https://t.co/56JT2hYpUj

jarotter:
RT @StatsPapers: Interpretable Reinforcement Learning via Differentiable Decision Trees. https://t.co/iUsh6JQHtL

PerthMLGroup:
RT @Miles_Brundage: "Interpretable Reinforcement Learning via Differentiable Decision Trees," Rodriguez et al.: https://t.co/56JT2hYpUj

saikrishna_gvs:
RT @Miles_Brundage: "Interpretable Reinforcement Learning via Differentiable Decision Trees," Rodriguez et al.: https://t.co/56JT2hYpUj

AssistedEvolve:
RT @Miles_Brundage: "Interpretable Reinforcement Learning via Differentiable Decision Trees," Rodriguez et al.: https://t.co/56JT2hYpUj

_Artemisa_v:
RT @StatsPapers: Interpretable Reinforcement Learning via Differentiable Decision Trees. https://t.co/iUsh6JQHtL

Stargazers: 1

Subscribers: 1

Forks: 0

Open Issues: 0

None.

Sample Sizes: None.

Authors: 4

Total Words: 7060

Unique Words: 2112

Multi-task learning, as it is understood nowadays, consists of using one
single model to carry out several similar tasks. From classifying hand-written
characters of different alphabets to figuring out how to play several Atari
games using reinforcement learning, multi-task models have been able to widen
their performance range across different tasks, although these tasks are
usually of a similar nature. In this work, we attempt to widen this range even
further, by including heterogeneous tasks in a single learning procedure. To do
so, we first formally define a multi-network model, identifying the necessary
components and characteristics to allow different adaptations of said model
depending on the tasks it is required to fulfill. Secondly, employing the
formal definition as a starting point, we develop an illustrative model example
consisting of three different tasks (classification, regression and data
sampling). The performance of this model implementation is then analyzed,
showing its capabilities. Motivated by the results...

arxiv_org:
Towards automatic construction of multi-network models for heterogeneous multi-task learn... https://t.co/HpWeREpZGh https://t.co/5GodeGCjel

ai_research:
We have just published our preprint on #VALP, a #multinetwork model for #heterogeneous #multitask #learning. VALP can combine different types of primary #NeuralNetworks and simultaneously solve #classification, #regression and #generation problems. https://t.co/bx2WMVxcFE https://t.co/zSbdvKXH2X

arxivml:
"Towards automatic construction of multi-network models for heterogeneous multi-task learning",
Unai Garciarena, Al…
https://t.co/hpd6lWOPhr

SciFi:
Towards automatic construction of multi-network models for heterogeneous multi-task learning. https://t.co/4gh0m0KVTq

ElectronNest:
RT @arxiv_org: Towards automatic construction of multi-network models for heterogeneous multi-task learn... https://t.co/HpWeREpZGh https:/…

Stargazers: 0

Subscribers: 1

Forks: 0

Open Issues: 0

None.

Sample Sizes: None.

Authors: 3

Total Words: 11460

Unique Words: 3061

One problem in the application of reinforcement learning to real-world
problems is the curse of dimensionality in the action space. Macro actions,
sequences of primitive actions, have been studied to diminish the dimensionality
of the action space with regard to the time axis. However, previous studies
relied on humans defining macro actions or assumed macro actions as repetitions
of the same primitive actions. We present Factorized Macro Action Reinforcement
Learning (FaMARL), which autonomously learns a disentangled factor representation
of a sequence of actions to generate macro actions that can be directly applied
to general reinforcement learning algorithms. FaMARL exhibits higher scores
than other reinforcement learning algorithms on environments that require an
extensive amount of search.

arxiv_in_review:
#IJCAI19 Macro Action Reinforcement Learning with Sequence Disentanglement using Variational Autoencoder. (arXiv:1903.09366v1 [cs.LG]) https://t.co/okz2BhuV83

arxivml:
"Macro Action Reinforcement Learning with Sequence Disentanglement using Variational Autoencoder",
Kim Heecheol, Ma…
https://t.co/Yyp4CRI7jf

SciFi:
Macro Action Reinforcement Learning with Sequence Disentanglement using Variational Autoencoder. https://t.co/M9ReGK6Jpr

arxiv_cs_LG:
Macro Action Reinforcement Learning with Sequence Disentanglement using Variational Autoencoder. Kim Heecheol, Masanori Yamada, Kosuke Miyoshi, and Hiroshi Yamakawa https://t.co/KtGjiVwGQU

None.

None.

Sample Sizes: None.

Authors: 4

Total Words: 4699

Unique Words: 1564

This paper presents Acquisition Thompson Sampling (ATS), a novel algorithm
for batch Bayesian Optimization (BO) based on the idea of sampling multiple
acquisition functions from a stochastic process. We define this process through
the dependency of the acquisition functions on a set of model parameters. ATS
is conceptually simple, straightforward to implement and, unlike other batch BO
methods, it can be employed to parallelize any sequential acquisition function.
In order to improve performance for multi-modal tasks, we show that ATS can be
combined with existing techniques in order to realize different explore-exploit
trade-offs and take into account pending function evaluations. We present
experiments on a variety of benchmark functions and on the hyper-parameter
optimization of a popular gradient boosting tree algorithm. These demonstrate
the competitiveness of our algorithm with two state-of-the-art batch BO
methods, and its advantages over classical parallel Thompson Sampling for BO.
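The core idea in the abstract above, drawing several acquisition functions by sampling the model parameters they depend on and maximizing each one to obtain a batch, can be sketched as follows. This is a toy illustration and not the ATS algorithm itself: the kernel-weighted surrogate, the log-uniform draw of `length_scale`, and every name in the code are assumptions made for the example.

```python
import math
import random


def kernel(a, b, length_scale):
    """Squared-exponential similarity between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def acquisition(x, observed, length_scale):
    """Hypothetical optimistic score: kernel-weighted mean plus an uncertainty bonus."""
    sims = [kernel(x, xo, length_scale) for xo, _ in observed]
    mean = sum(s * y for s, (_, y) in zip(sims, observed)) / (sum(sims) + 1e-12)
    uncertainty = 1.0 - max(sims)  # large when x is far from all observations
    return mean + uncertainty


def sample_batch(observed, candidates, batch_size, rng):
    """Draw one model parameter per batch slot and maximize the resulting acquisition."""
    batch = []
    for _ in range(batch_size):
        # Assumed stand-in for sampling from a posterior over model parameters.
        length_scale = math.exp(rng.uniform(math.log(0.05), math.log(1.0)))
        best = max(candidates, key=lambda x: acquisition(x, observed, length_scale))
        batch.append(best)
    return batch


observed = [(0.1, 0.2), (0.5, 0.9), (0.9, 0.3)]   # made-up (input, value) pairs
candidates = [i / 50 for i in range(51)]          # grid on [0, 1]
batch = sample_batch(observed, candidates, batch_size=4, rng=random.Random(0))
```

Because each batch slot uses a different parameter draw, the slots can be evaluated in parallel, and the spread of the draws is what diversifies the batch in this sketch.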

None.

arxiv_in_review:
#ICML2019 Sampling Acquisition Functions for Batch Bayesian Optimization. (arXiv:1903.09434v1 [cs.LG]) https://t.co/iv2yZ2c2k1

arxivml:
"Sampling Acquisition Functions for Batch Bayesian Optimization",
Alessandro De Palma, Celestine Mendler-Dünner, Th…
https://t.co/rMncB29bP0

arxiv_cs_LG:
Sampling Acquisition Functions for Batch Bayesian Optimization. Alessandro De Palma, Celestine Mendler-Dünner, Thomas Parnell, Andreea Anghel, and Haralampos Pozidis https://t.co/cviragGxNR

StatsPapers:
Sampling Acquisition Functions for Batch Bayesian Optimization. https://t.co/mjJ56utzhl

None.

None.

Sample Sizes: None.

Authors: 5

Total Words: 6972

Unique Words: 2115

The recommender system is an important form of intelligent application, which
helps users cope with information redundancy. Among the metrics used
to evaluate a recommender system, the metric of conversion has become more and
more important. The majority of existing recommender systems perform poorly on
the metric of conversion due to its extremely sparse feedback signal. To tackle
this challenge, we propose a deep hierarchical reinforcement learning based
recommendation framework, which consists of two components: a high-level
agent and a low-level agent. The high-level agent captures long-term sparse
conversion signals and automatically sets abstract goals for the low-level agent,
while the low-level agent follows the abstract goals and interacts with
the real-time environment. To solve the inherent problem in hierarchical
reinforcement learning, we propose a novel deep hierarchical reinforcement
learning algorithm via multi-goals abstraction (HRL-MG). Our proposed algorithm
contains three characteristics: 1) the high-level...

None.

arxivml:
"Deep Hierarchical Reinforcement Learning Based Recommendations via Multi-goals Abstraction",
Dongyang Zhao, Liang …
https://t.co/0vmGQslDec

SciFi:
Deep Hierarchical Reinforcement Learning Based Recommendations via Multi-goals Abstraction. https://t.co/Qsld3jf2Rt

arxiv_cs_LG:
Deep Hierarchical Reinforcement Learning Based Recommendations via Multi-goals Abstraction. Dongyang Zhao, Liang Zhang, Bo Zhang, Lizhou Zheng, Yongjun Bao, and Weipeng Yan https://t.co/OpmvsNIwB6

AdaptToReality:
RT @SciFi: Deep Hierarchical Reinforcement Learning Based Recommendations via Multi-goals Abstraction. https://t.co/Qsld3jf2Rt

None.

None.

Sample Sizes: None.

Authors: 6

Total Words: 10084

Unique Words: 2549

We propose Deep Q-Networks (DQN) with model-based exploration, an algorithm
combining both model-free and model-based approaches that explores better and
learns environments with sparse rewards more efficiently. DQN is a
general-purpose, model-free algorithm and has been proven to perform well in a
variety of tasks, including Atari 2600 games, since it was first proposed by Mnih
et al. However, like many other reinforcement learning (RL) algorithms, DQN
suffers from poor sample efficiency when rewards are sparse in an environment.
As a result, most of the transitions stored in the replay memory have no
informative reward signal, and provide limited value to the convergence and
training of the Q-Network. However, one insight is that these transitions can
be used to learn the dynamics of the environment as a supervised learning
problem. The transitions also provide information about the distribution of
visited states. Our algorithm uses these two observations to perform
one-step planning during exploration to pick an action that...

arxivml:
"DQN with model-based exploration: efficient learning on environments with sparse rewards",
Stephen Zhen Gou, Yuyan…
https://t.co/dljHZ7B6dE

arxiv_cs_LG:
DQN with model-based exploration: efficient learning on environments with sparse rewards. Stephen Zhen Gou and Yuyang Liu https://t.co/bnv1pPPMCd

StatsPapers:
DQN with model-based exploration: efficient learning on environments with sparse rewards. https://t.co/oGufQO8MsA

jd_mashiro:
RT @StatsPapers: DQN with model-based exploration: efficient learning on environments with sparse rewards. https://t.co/oGufQO8MsA

None.

None.

Sample Sizes: None.

Authors: 2

Total Words: 3615

Unique Words: 1274

Many machine learning systems make extensive use of large amounts of data
regarding human behaviors. Several researchers have found various
discriminatory practices related to the use of human-related machine learning
systems, for example in the field of criminal justice, credit scoring and
advertising. Fair machine learning is therefore emerging as a new field of
study to mitigate biases that are inadvertently incorporated into algorithms.
Data scientists and computer engineers are making various efforts to provide
definitions of fairness. In this paper, we provide an overview of the most
widespread definitions of fairness in the field of machine learning, arguing
that the ideas highlighting each formalization are closely related to different
ideas of justice and to different interpretations of democracy embedded in our
culture. This work intends to analyze the definitions of fairness that have
been proposed to date to interpret the underlying criteria and to relate them
to different ideas of democracy.

arxivml:
"The invisible power of fairness. How machine learning shapes democracy",
Elena Beretta, Antonio Santangelo, Bruno …
https://t.co/Nqx6oOkUBh

arxiv_cs_LG:
The invisible power of fairness. How machine learning shapes democracy. Elena Beretta, Antonio Santangelo, Bruno Lepri, Antonio Vetrò, and Juan Carlos De Martin https://t.co/OxYxWsxPlX

Memoirs:
The invisible power of fairness. How machine learning shapes democracy. https://t.co/zj6Z4auLt5

insurrealist:
RT @Memoirs: The invisible power of fairness. How machine learning shapes democracy. https://t.co/zj6Z4auLt5

None.

None.

Sample Sizes: None.

Authors: 5

Total Words: 5735

Unique Words: 1829

Machine learning offers remarkable benefits for improving workplaces and
working conditions, among others in the recycling industry. There, for example,
hand-sorting of medium-value scrap is labor-intensive and requires experienced
and skilled workers. On the one hand, they must concentrate intensely to
read and analyze the material properly; on the other hand,
the work is monotonous. Therefore, a machine learning approach is proposed for
quick and reliable automated identification of alloys in the recycling
industry, while the mere handling of scrap is left in the hands of
the workers. To this end, a set of twelve tool and high-speed steels from the
field were selected to be identified by their spectrum induced by electric
arcs. For data acquisition, the optical emission spectrometer Thorlabs CCS 100
was used. Spectra have been post-processed to be fed into the supervised
machine learning algorithm. The development of the machine learning software is
conducted according to the steps of the VDI 2221...

arxivml:
"Artificial intelligence-based process for metal scrap sorting",
Maximilian Auer, Kai Osswald, Raphael Volz, Joerg …
https://t.co/GtOJ8Zv2O2

SciFi:
Artificial intelligence-based process for metal scrap sorting. https://t.co/aSvIlUnA2g

arxiv_cs_LG:
Artificial intelligence-based process for metal scrap sorting. Maximilian Auer, Kai Osswald, Raphael Volz, and Joerg Woidasky https://t.co/ga4FqxCFMy

None.

None.

Sample Sizes: None.

Authors: 4

Total Words: 3440

Unique Words: 1236

Wide-band Electromagnetic Induction Sensors (WEMI) have been used for a
number of years in subsurface detection of explosive hazards. While WEMI
sensors have proven effective at localizing objects exhibiting large magnetic
responses, detecting objects lacking or containing very low amounts of
conductive materials can be challenging. In this paper, we compare a number of
target detection algorithms in the literature in terms of detection
performance. In the comparison, methods are tested on two real-world data sets:
one containing relatively low amounts of ground noise pollution, and the other
demonstrating highly-magnetic soil interference. Results are quantitatively
evaluated through receiver operating characteristic (ROC) curves and are used to
highlight the strengths and weaknesses of the compared approaches in hand-held
explosive hazard detection.
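For readers unfamiliar with the evaluation tool named in the abstract above, the following short Python sketch shows how an ROC curve and its area can be computed from detector confidences and ground-truth labels. The scores and labels are made up for illustration and have no connection to the paper's data sets; the function names are likewise assumptions.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    labels must contain at least one 1 (target) and one 0 (non-target).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; everything scoring >= threshold is an alarm.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]   # hypothetical detector confidences
labels = [1, 1, 0, 1, 0, 0]               # 1 = target present, 0 = clutter
curve = roc_points(scores, labels)
area = auc(curve)
```

On this toy data the area is 8/9, which matches the fraction of target and clutter pairs in which the target receives the higher score.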

arxivml:
"Comparison of Hand-held WEMI Target Detection Algorithms",
Connor H. McCurley, James Bocinsky, Alina Zare
https://t.co/xMtl3ZP7uS

arxiv_cs_LG:
Comparison of Hand-held WEMI Target Detection Algorithms. Connor H. McCurley, James Bocinsky, and Alina Zare https://t.co/GW7dmFh4Hh

Memoirs:
Comparison of Hand-held WEMI Target Detection Algorithms. https://t.co/pdiL4v4hz3

None.

None.

Sample Sizes: None.

Authors: 3

Total Words: 8386

Unique Words: 2400

For autonomous agents to successfully operate in the real world, the ability to
anticipate future motions of surrounding entities in the scene can greatly
enhance their safety levels since potentially dangerous situations could be
avoided in advance. While impressive results have been shown on predicting each
agent's behavior independently, we argue that it is not valid to consider road
entities individually since transitions of vehicle states are highly coupled.
Moreover, as the predicted horizon becomes longer, modeling prediction
uncertainties and multi-modal distributions over future sequences will turn
into a more challenging task. In this paper, we address this challenge by
presenting a multi-modal probabilistic prediction approach. The proposed method
is based on a generative model and is capable of jointly predicting sequential
motions of each pair of interacting agents. Most importantly, our model is
interpretable: it can explain its underlying logic, which makes it more
reliable to use in real applications. A...

None.

arxivml:
"Multi-modal Probabilistic Prediction of Interactive Behavior via an Interpretable Model",
Yeping Hu, Wei Zhan, Mas…
https://t.co/v27aqPqAAH

arxiv_cs_LG:
Multi-modal Probabilistic Prediction of Interactive Behavior via an Interpretable Model. Yeping Hu, Wei Zhan, and Masayoshi Tomizuka https://t.co/r17n9ccIko

Memoirs:
Multi-modal Probabilistic Prediction of Interactive Behavior via an Interpretable Model. https://t.co/t5hSKFpDbq

None.

None.

Sample Sizes: None.

Authors: 3

Total Words: 5312

Unique Words: 1789

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real time) based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

*Tracking 100,376 papers.*
