Pronoun Resolution for Tweets in Turkish

Authors

Keywords:

Pronoun Resolution, Turkish Tweets, Machine Learning, Learning Curve, Social Media

Abstract

This paper analyses anaphoric relations in Turkish tweets. The analysis rests on the results of a series of experiments conducted with a group of machine learning algorithms: J48, Voted Perceptron, SVM (support vector machine), Naive Bayes and k-nearest neighbours. The classifiers are tested under parametric variations and scrutinized to elaborate on the problem of fitting a suitable model to the task at hand. Another contribution of the paper is a comparison between two text genres, namely tweets and children's stories: our experimental results are compared with those of previous work, which yields a comparison between the anaphoric structure of tweets and that of children's stories in Turkish.
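As a rough illustration of what testing a classifier under parametric variation can look like, the sketch below sweeps the parameter k of a k-nearest-neighbours classifier, with and without distance weighting, and scores each setting with the F-measure and Cohen's kappa. It is not the authors' implementation: the function names are hypothetical, and the feature vectors and labels are invented toy values standing in for pronoun-candidate pairs.

```python
import math
from collections import Counter


def knn_predict(train, query, k, weighted=False):
    """Label `query` by a vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs. With weighted=True
    each neighbour votes with weight 1/distance instead of 1.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter()
    for features, label in nearest:
        d = math.dist(features, query)
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    return votes.most_common(1)[0][0]


def f_measure(gold, pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(gold, pred):
    """Agreement between gold and predicted labels, corrected for chance."""
    n = len(gold)
    observed = sum(1 for g, p in zip(gold, pred) if g == p) / n
    gold_counts, pred_counts = Counter(gold), Counter(pred)
    expected = sum(
        (gold_counts[label] / n) * (pred_counts[label] / n)
        for label in set(gold) | set(pred)
    )
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Toy data (assumption): two made-up numeric features per
    # pronoun-candidate pair; label 1 = antecedent, 0 = not an antecedent.
    train = [((0.0, 1.0), 1), ((1.0, 1.0), 1), ((1.0, 0.0), 1),
             ((4.0, 0.0), 0), ((5.0, 1.0), 0), ((6.0, 0.0), 0)]
    test = [((0.5, 1.0), 1), ((2.0, 0.0), 1), ((4.5, 0.0), 0), ((3.0, 1.0), 0)]
    gold = [label for _, label in test]

    for k in (1, 3, 5):
        for weighted in (False, True):
            pred = [knn_predict(train, x, k, weighted) for x, _ in test]
            print(k, weighted, f_measure(gold, pred), cohen_kappa(gold, pred))
```

Reporting kappa alongside the F-measure is a common choice when the class distribution is skewed, since kappa discounts the agreement that would be expected by chance.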

Performance scores (F-measure and Kappa statistic) of the kNN algorithm as the parameter “k” changes, and for distance-weighted k-NN

Published

27.05.2022

How to Cite

Abaci, H., Eminagaoglu, M., Gor, I., & Kılıcaslan, Y. (2022). Pronoun Resolution for Tweets in Turkish. International Journal of Intelligent Systems and Applications in Engineering, 10(2), 207–215. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/1847

Section

Research Article