Clustering of Mitochondrial D-loop Sequences Using Similarity Matrix, PCA and K-means Algorithm
Abstract
In this study, mitochondrial displacement-loop (D-loop) sequences isolated from different hominid species are clustered using a similarity matrix, Principal Component Analysis (PCA), and the K-means algorithm. First, the mitochondrial D-loop sequence data are retrieved from the GenBank database and loaded into MATLAB. Pairwise distances are then computed using the p-distance and Jukes-Cantor methods. A phylogenetic tree is constructed, and a similarity matrix is generated from the pairwise distances. Clustering is first performed using the K-means algorithm alone; PCA and K-means are then used together to cluster the mitochondrial D-loop sequences.
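The distance and clustering steps described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than MATLAB and is not the authors' implementation. The toy sequences, the function names, the gap-skipping rule, the use of distance-matrix rows as feature vectors, and the farthest-point seeding of K-means are all assumptions made for illustration. The PCA step is omitted to keep the example dependency-free.

```python
import math

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: sites with a gap ('-') in either sequence are skipped.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor(p):
    """Jukes-Cantor correction d = -(3/4) ln(1 - 4p/3), finite for p < 0.75."""
    if p >= 0.75:
        return math.inf  # saturated; such pairs would need special handling
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def distance_matrix(seqs):
    """Symmetric matrix of Jukes-Cantor distances with a zero diagonal."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = jukes_cantor(p_distance(seqs[i], seqs[j]))
    return d

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def kmeans(points, k, max_iter=100):
    """Lloyd's K-means with deterministic farthest-point seeding (assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy alignment, not real D-loop data.
seqs = ["ACGTACGT", "ACGTACGA", "TTGTCCGG", "TTGTCCGA"]
dist = distance_matrix(seqs)
# Each sequence is represented by its row of distances to all sequences.
labels = kmeans(dist, k=2)
print(labels)  # [0, 0, 1, 1] for this toy input
```

Here the first two and the last two toy sequences fall into separate clusters. With real data, the sequences would first need to be aligned, and a projection onto the leading principal components could be applied to the rows before calling `kmeans`.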
Copyright (c) 2018 International Journal of Intelligent Systems and Applications in Engineering
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.