Learning Optimal Decision Trees from Large Datasets
Inferring a decision tree from a given dataset is one of the classic problems in machine learning. This problem consists of buildings, from a labelled dataset, a tree such that each node corresponds to a class and a path between the tree root and a leaf corresponds to a conjunction of features to be satisfied in this class. Following the principle of parsimony, we want to infer a minimal tree consistent with the dataset. Unfortunately, inferring an optimal decision tree is known to be NP-complete for several definitions of optimality. Hence, the majority of existing approaches relies on heuristics, and as for the few exact inference approaches, they do not work on large data sets. In this paper, we propose a novel approach for inferring a decision tree of a minimum depth based on the incremental generation of Boolean formula. The experimental results indicate that it scales sufficiently well and the time it takes to run grows slowly with the size of dataset.
NurtureToken New!

Token crowdsale for this paper ends in

Buy Nurture Tokens

Author

Are you an author of this paper? Check the Twitter handle we have for you is correct.

Florent Avellaneda (add twitter)
Ask The Authors

Ask the authors of this paper a question or leave a comment.

Read it. Rate it.
#1. Which part of the paper did you read?

#2. The paper contains new data or analyses that is openly accessible?
#3. The conclusion is supported by the data and analyses?
#4. The conclusion is of scientific interest?
#5. The result is likely to lead to future research?

Github
Stargazers:
1
Forks:
1
Open Issues:
0
Network:
1
Subscribers:
2
Language:
Python
Binary Programming Formulation for Learning Classification Trees Using Cplex
Youtube
Link:
None (add)
Views:
0
Likes:
0
Dislikes:
0
Favorites:
0
Comments:
0
Other
Sample Sizes (N=):
Inserted:
Words Total:
Words Unique:
Source:
Abstract:
None
04/14/19 06:01PM
7,253
1,744
Tweets
arxivml: "Learning Optimal Decision Trees from Large Datasets", Florent Avellaneda https://t.co/9NXoM8QutK
arxiv_cs_LG: Learning Optimal Decision Trees from Large Datasets. Florent Avellaneda https://t.co/6JE3b2G11e
Images
Related