Training Subset Selection to Improve Prediction Accuracy in Investment Ranking
Abstract
Most studies in the supervised learning literature assume that the training set and the test set are generated from the same distribution. If this assumption does not hold, training a model on the whole dataset may significantly reduce prediction accuracy. Here we propose a training instance selection method that constructs a subset of the training set to maximize prediction accuracy. We applied the proposed algorithm to an investment ranking problem in which the training dataset consists of multiple time periods. Our algorithm finds the best group of periods to include in the training set to maximize prediction accuracy for a given target period. By including only similar periods in the training set, prediction performance improves significantly relative to a training scheme that uses all of the previous periods to train the model. The proposed algorithm ranked first in the IEEE Investment Ranking Challenge 2018, which was organized as part of the IEEE Data Science Workshop 2018.
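The abstract does not spell out how the group of periods is searched for, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes greedy forward selection over periods, a held-out validation period standing in for the target period, an ordinary least-squares model, and Spearman rank correlation as the accuracy measure. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    # Ordinary least squares with an appended intercept column.
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def rank_correlation(a, b):
    # Spearman correlation as the Pearson correlation of ranks (ties not handled).
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return float(np.corrcoef(ra, rb)[0, 1])

def select_periods(periods, X_val, y_val):
    """Greedy forward selection of training periods (an assumed search strategy).

    periods: dict mapping a period label to an (X, y) pair of arrays.
    Returns the chosen labels and the validation score they achieve.
    """
    chosen, best = [], -np.inf
    remaining = set(periods)
    while remaining:
        trial_scores = {}
        for p in sorted(remaining):
            labels = chosen + [p]
            X = np.vstack([periods[q][0] for q in labels])
            y = np.concatenate([periods[q][1] for q in labels])
            pred = predict_linear(fit_linear(X, y), X_val)
            trial_scores[p] = rank_correlation(pred, y_val)
        p_best = max(trial_scores, key=trial_scores.get)
        if trial_scores[p_best] <= best:
            break  # no candidate period improves the validation score
        chosen.append(p_best)
        best = trial_scores[p_best]
        remaining.remove(p_best)
    return chosen, best

# Synthetic demo: periods A and B share one regime, C and D follow another.
rng = np.random.default_rng(0)
w_a = np.array([1.0, 0.5, 0.0])
w_b = np.array([-1.0, 0.5, 0.0])

def make_period(w, n=200):
    X = rng.normal(size=(n, 3))
    return X, X @ w + 0.1 * rng.normal(size=n)

periods = {
    "A": make_period(w_a),
    "B": make_period(w_a),
    "C": make_period(w_b),
    "D": make_period(w_b),
}
X_val, y_val = make_period(w_a)  # validation period drawn from the A/B regime

chosen, score = select_periods(periods, X_val, y_val)
print(chosen, round(score, 3))
```

In this toy setup the validation period comes from the same regime as periods A and B, so the selection should keep periods from that regime and leave out C and D, whose inclusion would pull the fitted coefficients away from the target regime. In practice the choice of validation period, model, and search strategy would all need to be matched to the problem.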
Copyright (c) 2019 International Journal of Intelligent Systems and Applications in Engineering
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.