Improving Deepfake Audio Detection: A Support Vector Machine Approach with Mel-Frequency Cepstral Coefficients
Keywords:
Audio Analysis, Deepfake Detection, Feature Extraction, Media Manipulation, Mel-Frequency Cepstral Coefficients (MFCCs), Support Vector Machine (SVM)

Abstract
This paper presents a machine learning system that differentiates genuine from synthetic speech using a Support Vector Machine (SVM) classifier. The system extracts Mel-Frequency Cepstral Coefficients (MFCCs) as features and is trained on the 'for-original' subset of the Fake-or-Real (FoR) dataset, which comprises over 195,000 genuine and computer-generated utterances. Evaluation yields an accuracy of 97.28%, indicating the system's potential efficacy in real-world applications. By highlighting the value of raw, unprocessed data in classifier training, the work lays a foundation for future improvements in the robustness and reliability of deepfake detection.
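The abstract names only the building blocks of the pipeline (MFCC feature extraction, SVM classification), not an implementation. Below is a minimal sketch of such a pipeline in plain NumPy and scikit-learn, with synthetic "real" (harmonic) and "fake" (noise) clips standing in for the FoR audio files. All function names, parameters, and the toy data are illustrative assumptions, not the authors' code.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

SR = 16000  # sample rate (Hz); speech corpora are commonly resampled to 16 kHz

def mfcc_mean(signal, sr=SR, n_fft=512, n_mels=26, n_mfcc=13):
    """Average MFCC vector of a mono signal (hand-rolled, illustrative)."""
    hop = n_fft // 2
    frames = [signal[i:i + n_fft] * np.hanning(n_fft)
              for i in range(0, len(signal) - n_fft, hop)]
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel filterbank between 0 Hz and the Nyquist frequency.
    hz2mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel2hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz2mel(0.0), hz2mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel2hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, c, hi = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, lo:c] = (np.arange(lo, c) - lo) / max(c - lo, 1)
        fbank[m - 1, c:hi] = (hi - np.arange(c, hi)) / max(hi - c, 1)
    logmel = np.log(power @ fbank.T + 1e-10)
    # DCT-II decorrelates the log-mel energies; keep the first n_mfcc terms.
    basis = np.cos(np.pi / n_mels * (np.arange(n_mels) + 0.5)[:, None]
                   * np.arange(n_mfcc))
    return (logmel @ basis).mean(axis=0)  # fixed-length vector per clip

rng = np.random.default_rng(0)
t = np.arange(SR // 2) / SR  # half a second per toy clip

def fake_clip():   # stand-in for a synthetic utterance: unstructured noise
    return rng.normal(0, 0.3, t.size)

def real_clip():   # stand-in for genuine speech: harmonic tone + light noise
    f0 = rng.uniform(100, 300)
    return sum(np.sin(2 * np.pi * f0 * k * t) / k for k in (1, 2, 3)) \
        + rng.normal(0, 0.01, t.size)

X = np.array([mfcc_mean(real_clip()) for _ in range(40)]
             + [mfcc_mean(fake_clip()) for _ in range(40)])
y = np.array([1] * 40 + [0] * 40)  # 1 = real, 0 = fake

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"held-out accuracy: {acc:.2f}")
```

In a real experiment, the toy clips would be replaced by utterances loaded from the FoR 'for-original' archive, and a library extractor (e.g. `librosa.feature.mfcc`) would typically stand in for the hand-rolled MFCC routine; the averaging step simply collapses each variable-length utterance into one fixed-length feature vector for the SVM.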
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets readers share and adapt the material provided they give appropriate credit, link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.