Training Subset Selection to Improve Prediction Accuracy in Investment Ranking

Keywords: Supervised learning, data science, instance selection, investment ranking

Abstract

Most studies in the supervised learning literature assume that the training set and the test set are generated from the same distribution. If this assumption does not hold, training a model on the whole dataset may significantly reduce prediction accuracy. We propose a training instance selection method that constructs a subset of the training set to maximize prediction accuracy. We applied the proposed algorithm to an investment ranking problem in which the training dataset spans multiple time periods. The algorithm finds the group of periods to include in the training set that maximizes prediction accuracy for a given target period. Including only periods similar to the target period significantly improves prediction performance over a training scheme that uses all previous periods. The proposed algorithm ranked first in the IEEE Investment Ranking Challenge 2018, which was organized as part of the IEEE Data Science Workshop 2018.
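
The abstract does not state how the group of periods is searched for, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a greedy forward selection, a held-out validation period standing in for the target period, and a toy one-feature linear model; the names `fit_slope`, `mse`, and `select_periods` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# One time period = a list of (feature, target) pairs. Illustrative layout only.
Period = List[Tuple[float, float]]


def fit_slope(rows: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope for the toy model y = w * x (assumed model)."""
    sxx = sum(x * x for x, _ in rows)
    sxy = sum(x * y for x, y in rows)
    return sxy / sxx if sxx else 0.0


def mse(w: float, rows: Sequence[Tuple[float, float]]) -> float:
    """Mean squared error of y = w * x on the given rows."""
    return sum((y - w * x) ** 2 for x, y in rows) / len(rows)


def select_periods(periods: Dict[str, Period], validation: Period) -> List[str]:
    """Greedy forward selection (an assumption, not the paper's method).

    Repeatedly add the period whose inclusion lowers the validation error
    the most; stop when no remaining period improves it.
    """
    chosen: List[str] = []
    best_err = float("inf")
    while True:
        candidate, candidate_err = None, best_err
        for name in periods:
            if name in chosen:
                continue
            # Pool the rows of the already chosen periods plus this candidate.
            rows = [r for p in chosen + [name] for r in periods[p]]
            err = mse(fit_slope(rows), validation)
            if err < candidate_err:
                candidate, candidate_err = name, err
        if candidate is None:
            return chosen
        chosen.append(candidate)
        best_err = candidate_err
```

In this sketch, a period whose relationship between feature and target differs from the validation period is never added, because pooling it with the chosen periods raises the validation error. Training on all periods would include it regardless.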

Published
2019-03-20
How to Cite
[1]
M. Koseoglu, “Training Subset Selection to Improve Prediction Accuracy in Investment Ranking”, IJISAE, vol. 7, no. 1, pp. 42-46, Mar. 2019.
Section
Research Article