Code Plagiarism and Originality Detection using Machine Learning for Ethical Code Practices
Keywords:
Code Analysis, Code Originality System, Code Similarity, Ethical Coding Practices, Plagiarism Detection
Abstract
This paper presents the design and development of a Code Originality System, a software solution intended to preserve developers' intellectual property rights, uphold code quality, and promote ethical coding practices. The primary goal of the research is to build a system that uses several algorithms to analyze code similarity and detect plagiarism. By combining token-based approaches with machine learning models, the system addresses challenges of code diversity, scalability, privacy, and algorithmic precision. The paper describes the system's architectural components, customizable features, and integration capabilities. The findings, supported by statistical data, indicate that the system is effective in addressing these challenges and highlight its contribution to responsible and ethical software development practices.
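The abstract does not say which token-based approach the system uses. As an illustration only, the sketch below shows one common way such a comparison can work: source text is split into tokens, identifiers and numbers are replaced by placeholders so that renaming does not hide copying, and the two programs are compared by the Jaccard overlap of their token k-grams. The function names, the keyword set, and the choice of k = 4 are assumptions made for this example, not details taken from the paper.

```python
import re

# Assumption: a small illustrative keyword set, not a full language grammar.
KEYWORDS = {"def", "return", "if", "else", "for", "while", "in", "import"}

# Identifiers/keywords, integer literals, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
IDENT_RE = re.compile(r"[A-Za-z_]\w*")


def normalize_tokens(source):
    """Tokenize source and replace identifiers and numbers with placeholders."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            tokens.append(tok)      # keep keywords as they are
        elif IDENT_RE.fullmatch(tok):
            tokens.append("ID")     # hide variable and function names
        elif tok.isdigit():
            tokens.append("NUM")    # hide literal values
        else:
            tokens.append(tok)      # punctuation and operators
    return tokens


def kgrams(tokens, k=4):
    """Return the set of all runs of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def token_similarity(source_a, source_b, k=4):
    """Jaccard similarity of normalized token k-grams, between 0.0 and 1.0."""
    grams_a = kgrams(normalize_tokens(source_a), k)
    grams_b = kgrams(normalize_tokens(source_b), k)
    union = grams_a | grams_b
    if not union:
        return 0.0  # inputs too short to form any k-gram
    return len(grams_a & grams_b) / len(union)
```

Under these assumptions, two functions that differ only in their identifier names produce identical token streams and therefore score 1.0, while structurally unrelated code scores near 0.0. A production system would need a real lexer for each supported language and a threshold chosen from labeled data.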
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, readers may share and adapt the material provided that they give appropriate credit, provide a link to the license, and indicate if changes were made. If they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.