An Efficient Filter Based Weighted Prior and Posterior Clustering Framework for Air Pollution Prediction
Keywords:
Clustering, Air Quality Index, Classification, Outliers
Abstract
As air pollution datasets continue to grow, sample variability makes it increasingly difficult to group similar air quality indicators. Traditional air quality clustering techniques rely on static markers to estimate comparable severity levels from sparse and ambiguous data. Existing approaches also depend on a single cluster metric, which makes it hard to capture both local and global data samples for severity categorization. This study presents a hybrid outlier identification model for data clustering, which aims to identify the most severe air quality feature values. We propose clustering the filtered data with a technique that considers both local and global measures. Finally, a classification learning model is applied to predict air quality severity. Experiments on a time-series air pollution dataset show that the outlier-based data clustering and classification framework performs better than conventional methods for predicting air quality.
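The abstract does not specify the algorithms involved, so the following is only a minimal sketch of the three stages it names (outlier filtering, clustering, classification), not the authors' method. Everything in it is an assumption made for illustration: the toy (PM2.5, NO2) readings, the thresholds, the "flag if either test fires" hybrid rule, plain k-means as the clustering step, and a nearest-centroid rule as the classifier.

```python
import math
import statistics

# Hypothetical (PM2.5, NO2) readings: three severity groups plus two anomalies.
READINGS = [
    (10, 15), (12, 18), (11, 16), (13, 17),      # low
    (55, 40), (58, 42), (52, 38), (56, 41),      # moderate
    (120, 80), (125, 85), (118, 78), (122, 82),  # high
    (400, 5),                                    # far from everything
    (85, 60),                                    # isolated between two groups
]
LABELS = ["low", "moderate", "high"]  # assumed severity names

def knn_score(i, points, k):
    """Local measure: mean distance from points[i] to its k nearest neighbours."""
    d = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return sum(d[:k]) / k

def filter_outliers(points, k=2, global_z=2.5, local_ratio=3.0):
    """Drop a point if a global OR a local test flags it (assumed hybrid rule)."""
    centre = [statistics.fmean(col) for col in zip(*points)]
    radial = [math.dist(p, centre) for p in points]
    mu, sigma = statistics.fmean(radial), statistics.pstdev(radial)
    local = [knn_score(i, points, k) for i in range(len(points))]
    local_med = statistics.median(local)
    kept = []
    for p, r, s in zip(points, radial, local):
        # Global test: z-score of the distance to the overall centre.
        is_global = sigma > 0 and (r - mu) / sigma > global_z
        # Local test: neighbourhood sparsity relative to the median sparsity.
        is_local = local_med > 0 and s / local_med > local_ratio
        if not (is_global or is_local):
            kept.append(p)
    return kept

def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [min(points)]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[idx].append(p)
        new = [
            tuple(statistics.fmean(col) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new == centroids:
            break
        centroids = new
    # Sorting orders centroids by the first feature, used here as a severity proxy.
    return sorted(centroids)

def classify(reading, centroids, labels):
    """Nearest-centroid rule: return the label of the closest cluster centre."""
    idx = min(range(len(centroids)), key=lambda c: math.dist(reading, centroids[c]))
    return labels[idx]

kept = filter_outliers(READINGS)
centroids = kmeans(kept, k=3)
print(len(kept), classify((60, 45), centroids, LABELS))
```

In this toy data the point (85, 60) lies close to the overall centre, so only the local test flags it, which is one reason to combine a global and a local measure rather than rely on a single metric.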
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, anyone who reuses the material must give appropriate credit, provide a link to the license, and indicate if changes were made; anyone who remixes, transforms, or builds upon the material must distribute their contributions under the same license as the original.