Dataset Normalization in Cricket Score Prediction Using Weighted K-Means Clustering
Keywords:
Cricket Score Prediction, Feature Selection, Machine Learning, Weighted K-Means Clustering
Abstract
Cricket, as a highly dynamic and unpredictable sport, presents a unique challenge for accurate score prediction. This study proposes a novel approach to cricket score prediction by integrating machine learning techniques with feature selection through weighted k-means clustering. The goal is to enhance the predictive accuracy by identifying and leveraging the most relevant features from a pool of diverse cricket match attributes. The methodology begins with the collection of comprehensive cricket match data, including player statistics, team performance metrics, and match conditions. These features form the basis for building a predictive model. To address the challenge of feature selection, weighted k-means clustering is employed. This technique assigns weights to features based on their importance, ensuring that the model focuses on the most influential variables. The dataset is preprocessed to handle missing values, normalize data, and address outliers. The preprocessed data is then subjected to weighted k-means clustering, where features are grouped into clusters, and weights are assigned based on the intrinsic significance of each feature within its cluster. This ensures that the model prioritizes features with higher weights during the prediction process. The machine learning model is constructed using an ensemble of algorithms, such as decision trees, random forests, and gradient boosting, to harness the collective power of diverse approaches. The selected features from the weighted k-means clustering are incorporated into the model, enhancing its ability to capture the intricate patterns inherent in cricket matches.
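The abstract does not say how the feature weights are computed, so the following is only a minimal sketch of one common feature-weighted k-means variant, not the authors' method. It assumes that each feature's weight is inversely related to its within-cluster dispersion and that the weights sum to one. The function name `weighted_kmeans`, the parameters `init_idx` and `beta`, and the two-column toy data are all hypothetical and chosen purely for illustration.

```python
import numpy as np


def weighted_kmeans(X, init_idx, beta=2.0, n_iter=10):
    """Sketch of k-means that learns one weight per feature (column).

    Assumption: weights are inversely related to within-cluster dispersion.
    `beta` must be greater than 1; `init_idx` lists the rows used as seeds.
    """
    X = np.asarray(X, dtype=float)
    centers = X[list(init_idx)].copy()
    k, d = centers.shape
    weights = np.full(d, 1.0 / d)  # start with equal feature weights
    for _ in range(n_iter):
        # Assign each row to the nearest center under the weighted distance.
        sq_diff = (X[:, None, :] - centers[None, :, :]) ** 2  # shape (n, k, d)
        labels = (sq_diff * weights ** beta).sum(axis=2).argmin(axis=1)
        # Recompute each center; an empty cluster keeps its previous center.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
        # Per-feature dispersion around the assigned centers.
        dispersion = ((X - centers[labels]) ** 2).sum(axis=0) + 1e-12
        # Lower dispersion gives a higher weight; normalize to sum to 1.
        inv = (1.0 / dispersion) ** (1.0 / (beta - 1.0))
        weights = inv / inv.sum()
    return labels, centers, weights


# Toy data (made up): column 0 separates two groups, column 1 does not.
informative = np.array([0.0, 0.2, -0.2, 0.1, -0.1, 0.0,
                        5.0, 5.2, 4.8, 5.1, 4.9, 5.0])
noise = np.tile([-1.0, 0.0, 1.0], 4)
X_raw = np.column_stack([informative, noise])

# Z-score normalization, standing in for the preprocessing step.
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

# Seed rows 0 and 6 are picked by hand so the demo is deterministic.
labels, centers, weights = weighted_kmeans(X, init_idx=(0, 6))
print("feature weights:", weights)
```

On this toy input the informative column receives almost all of the weight, which is the behavior the abstract describes when it says the model prioritizes features with higher weights. In a fuller pipeline, the learned weights could be thresholded to select features before fitting the downstream ensemble; that step is not shown here.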
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, readers who use the material must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.