Dataset Normalization in Cricket Score Prediction Using Weighted K-Means Clustering


  • M. Chandru, S. Prasath


Cricket Score Prediction, Feature Selection, Machine Learning, Weighted K-Means Clustering


Cricket is a highly dynamic sport, and its many interacting factors make accurate score prediction difficult. This study proposes an approach to cricket score prediction that combines machine learning with feature selection through weighted k-means clustering. The goal is to improve predictive accuracy by identifying and using the most relevant features from a diverse pool of cricket match attributes. The methodology begins with the collection of cricket match data covering player statistics, team performance metrics, and match conditions, which together form the feature set for the predictive model. The dataset is first preprocessed to handle missing values, normalize the data, and address outliers. The preprocessed data is then subjected to weighted k-means clustering: features are grouped into clusters, and each feature is assigned a weight according to its significance within its cluster, so that the model prioritizes higher-weighted features during prediction. The predictive model is an ensemble of algorithms, such as decision trees, random forests, and gradient boosting, chosen to combine the strengths of diverse learners. The features selected by weighted k-means clustering are used as inputs to this ensemble, improving its ability to capture the intricate patterns inherent in cricket matches.
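The abstract does not state the normalization scheme or the weight-update rule, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes min-max normalization and a dispersion-based weight update in the style of feature-weighted k-means (features with smaller within-cluster dispersion receive larger weights). The function names, the `beta` parameter, the first-k-rows initialization, and the toy match data are all hypothetical and chosen for illustration.

```python
def min_max_normalize(rows):
    """Scale every feature column to [0, 1]; constant columns become 0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]


def weighted_kmeans(rows, k, beta=2.0, iters=20, eps=1e-9):
    """Feature-weighted k-means sketch. Returns (labels, centers, weights)."""
    if beta <= 1.0:
        raise ValueError("beta must be greater than 1 for this update rule")
    n_features = len(rows[0])
    # Assumption: naive, reproducible initialization from the first k rows.
    centers = [list(r) for r in rows[:k]]
    weights = [1.0 / n_features] * n_features
    labels = [0] * len(rows)
    exponent = 1.0 / (beta - 1.0)
    for _ in range(iters):
        # Assignment step: nearest center under the weighted squared distance.
        for i, row in enumerate(rows):
            dists = [sum((w ** beta) * (x - c) ** 2
                         for w, x, c in zip(weights, row, center))
                     for center in centers]
            labels[i] = dists.index(min(dists))
        # Center step: each center becomes the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        # Weight step: per-feature within-cluster dispersion (eps avoids
        # division by zero), then weights inversely related to dispersion.
        disp = [eps] * n_features
        for row, lab in zip(rows, labels):
            for f in range(n_features):
                disp[f] += (row[f] - centers[lab][f]) ** 2
        weights = [1.0 / sum((disp[f] / disp[t]) ** exponent
                             for t in range(n_features))
                   for f in range(n_features)]
    return labels, centers, weights


# Hypothetical toy data: column 0 separates two groups of innings cleanly,
# column 1 is spread out and carries little cluster structure.
raw = [[4.0, 10], [9.0, 20], [4.25, 90], [8.75, 80], [4.1, 50], [8.9, 40]]
data = min_max_normalize(raw)
labels, centers, weights = weighted_kmeans(data, k=2)
print(labels, [round(w, 3) for w in weights])
```

In this toy run the first column ends up with almost all of the weight, which is the behaviour the abstract describes: the clustering step highlights the features that structure the data, and those weights can then be used to select or scale the inputs passed to the ensemble model.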




