An Advanced Document Representation Technique Based Approach for Author Profiles Prediction using Word Embedding Techniques

Authors

  • D. Radha, Department of CSE, GITAM University, Visakhapatnam, India
  • P. Chandra Shekhar, Department of CSE, GITAM University, Visakhapatnam, India

Keywords:

Word Embedding Techniques, Term Weight Measures, Deep Learning, Gender Prediction, Age Prediction

Abstract

The internet has grown exponentially and at an unmanageable pace, largely because of how many people use social media, blogs, and online reviews. Most of this information is published in various contexts by many different writers. This abundance of information has challenged researchers and information analysts to develop automatic methods for analysing such content. Author profiling is a widely used approach in which researchers analyse the writing styles of authors to extract as much information as possible from their texts. It is a text categorization technique used to characterise writers from their written works and predict their demographic attributes, such as gender, age, native language, education, location, and personality traits. In today's information age, author profiling is an important approach with applications in forensic investigation, security, and marketing. Social media platforms have a substantial influence on daily life and are a source of crimes, including public humiliation, fraudulent profiles, defamation, blackmail, and stalking. Author profiling also helps the educational field: by examining a large group of students, it aids in revealing students' exceptional potential, and in educational forums it helps determine the appropriate level of knowledge for individual students or groups of students. Most author profiling techniques rely on a variety of stylistic features, including linguistic, content-based, structural, syntactic, and semantic features, to distinguish between writing styles. Existing features and models have not delivered a clear improvement in profile prediction accuracy.
This work proposes a word-embedding-based document representation technique to improve the accuracy of demographic predictions. It offers a new set of stylistic features, feature selection algorithms, and term weight measures. The resulting gender and age prediction models achieved accuracies of 0.9439 and 0.8945, respectively. The study uses the PAN Competition 2014 evaluation dataset for gender and age prediction. The experimental results outperform those of earlier models.
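
As an illustration of how term weight measures can be combined with word embeddings to represent a document, the following minimal sketch builds a document vector as a weighted average of word vectors. The abstract does not specify the weighting scheme or the embedding model, so the smoothed TF-IDF weights, the toy two-dimensional embeddings, and the function names `idf_weights` and `document_vector` are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Smoothed inverse document frequency for every term in tokenised docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def document_vector(doc, embeddings, idf, dim):
    """Weighted average of word vectors; weight = term frequency * IDF (assumed scheme)."""
    tf = Counter(term for term in doc if term in embeddings)
    total = [0.0] * dim
    weight_sum = 0.0
    for term, count in tf.items():
        weight = count * idf.get(term, 1.0)
        weight_sum += weight
        for i, value in enumerate(embeddings[term]):
            total[i] += weight * value
    if weight_sum == 0.0:
        return total  # no known words: fall back to the zero vector
    return [value / weight_sum for value in total]


# Toy embeddings and documents (hypothetical data, for illustration only).
embeddings = {
    "football": [1.0, 0.0],
    "match": [0.8, 0.2],
    "recipe": [0.0, 1.0],
    "cake": [0.1, 0.9],
}
docs = [["football", "match", "football"], ["recipe", "cake"], ["match", "cake"]]

idf = idf_weights(docs)
vectors = [document_vector(doc, embeddings, idf, dim=2) for doc in docs]
```

In a full pipeline, vectors of this kind would be passed to a classifier trained on gender or age labels; in practice the embeddings would come from a trained model rather than a hand-written dictionary.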

Published

05.12.2023

How to Cite

Radha, D., & Shekhar, P. C. (2023). An Advanced Document Representation Technique Based Approach for Author Profiles Prediction using Word Embedding Techniques. International Journal of Intelligent Systems and Applications in Engineering, 12(7s), 377–393. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/4081

Section

Research Article