An Active Learning Based Emoji Prediction Method in Turkish
DOI:
https://doi.org/10.18201/ijisae.2020158882
Keywords:
Active Learning, Emoji Prediction, Turkish Emoji Dataset
Abstract
Emoji use has become standard on social media platforms because emojis can convey feelings beyond what short text expresses. Recent advances in machine learning make it possible to write short messages with automatically predicted emojis. However, predicting emojis for a given short message can be complicated, since users may interpret emojis in ways that go beyond the intent of their designers. Consequently, automatically extracting training samples from large volumes of unlabeled tweets is not reliable. In this paper, we present an active learning method for predicting the emoji of a tweet using a Turkish emoji dataset with a limited number of labeled samples. To simulate human-machine collaborative learning, we train an initial classifier on this dataset and then update the classifier by selecting relevant samples from the large pool of unlabeled data. For evaluation, we hold out a randomly selected 25% of the tweets that contain exactly one emoji from the generated dataset as a test set. Our active learning method achieves an F1 score of 0.901 and outperforms the baseline supervised learning methods.
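The train-then-update loop described in the abstract can be illustrated with a minimal pool-based active learning sketch. The abstract does not state which classifier or sample selection strategy the paper uses, so everything here is an assumption made for illustration: a tiny Naive Bayes classifier, least-confidence (uncertainty) sampling as the query strategy, a hypothetical `oracle` function standing in for the human annotator, and made-up toy tweets with emoji short names as labels.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def predict_proba(self, text):
        total = sum(self.class_counts.values())
        log_scores = {}
        for c in self.classes:
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            score = math.log(self.class_counts[c] / total)
            for w in tokenize(text):
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[c][w] + 1) / denom)
            log_scores[c] = score
        # Normalise in a numerically stable way.
        peak = max(log_scores.values())
        exp = {c: math.exp(s - peak) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}

    def predict(self, text):
        proba = self.predict_proba(text)
        return max(proba, key=proba.get)


def active_learning(seed, pool, oracle, rounds=3, batch_size=2):
    """Train on a small labeled seed, then repeatedly query the oracle for
    the pool samples the current model is least confident about."""
    labeled = list(seed)
    pool = list(pool)
    model = NaiveBayes().fit(*zip(*labeled))
    for _ in range(rounds):
        if not pool:
            break
        # Least-confidence sampling (an assumed strategy): lowest top probability first.
        pool.sort(key=lambda t: max(model.predict_proba(t).values()))
        queried, pool = pool[:batch_size], pool[batch_size:]
        labeled += [(t, oracle(t)) for t in queried]
        model = NaiveBayes().fit(*zip(*labeled))  # retrain on the enlarged set
    return model, labeled


# Hypothetical toy data; the hidden labels simulate the human annotator.
seed = [("seni seviyorum", ":heart:"), ("bu video komik", ":joy:")]
hidden = {
    "seni seviyorum dostum": ":heart:",
    "komik bir video": ":joy:",
    "dostum benim": ":heart:",
    "bu film komik": ":joy:",
}
model, labeled = active_learning(seed, list(hidden), hidden.get, rounds=2, batch_size=2)
print(len(labeled), model.predict("seni seviyorum"), model.predict("komik video"))
```

In a real setting the oracle would be a human annotator rather than a lookup table, and the retraining step would use the classifier and features chosen in the paper, which this sketch does not attempt to reproduce.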
License
All papers should be submitted electronically. All submitted manuscripts must be original work that is not under submission at another journal or under consideration for publication in another form, such as a monograph or chapter of a book. Authors of submitted papers are obligated not to submit their paper for publication elsewhere until an editorial decision is rendered on their submission. Further, authors of accepted papers are prohibited from publishing the results in other publications that appear before the paper is published in the Journal unless they receive approval for doing so from the Editor-In-Chief.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.