Improving Deepfake Audio Detection: A Support Vector Machine Approach with Mel-Frequency Cepstral Coefficients

Authors

  • Shwetambari Borade Assistant Professor, Shah & Anchor Kutchhi Engineering College, Chembur, Mumbai, Maharashtra, India
  • Nilakshi Jain Professor, Shah & Anchor Kutchhi Engineering College, Chembur, Mumbai, Maharashtra, India
  • Bhavesh Patel Professor, Shah & Anchor Kutchhi Engineering College, Chembur, Mumbai, Maharashtra, India
  • Vineet Kumar Founder & Global President, CyberPeace Foundation, Delhi, India
  • Mustansir Godhrawala Student, Shah & Anchor Kutchhi Engineering College, Chembur, Mumbai, Maharashtra, India
  • Shubham Kolaskar Student, Shah & Anchor Kutchhi Engineering College, Chembur, Mumbai, Maharashtra, India
  • Yash Nagare Student, Shah & Anchor Kutchhi Engineering College, Chembur, Mumbai, Maharashtra, India
  • Pratham Shah Student, Shah & Anchor Kutchhi Engineering College, Chembur, Mumbai, Maharashtra, India
  • Jayan Shah Student, Shah & Anchor Kutchhi Engineering College, Chembur, Mumbai, Maharashtra, India

Keywords:

Audio Analysis, Deepfake Detection, Feature Extraction, Media Manipulation, Mel-Frequency Cepstral Coefficients (MFCCs), Support Vector Machine (SVM)

Abstract

This paper presents a machine learning system that distinguishes real from synthetic speech using a Support Vector Machine (SVM) classifier. The system is trained on the 'for-original' Fake-or-Real (FoR) dataset, which contains over 195,000 genuine and computer-generated utterances, and uses Mel-Frequency Cepstral Coefficients (MFCCs) as its features. In evaluation it reaches an accuracy of 97.28%, which suggests it could be effective in real-world applications. The work highlights the significance of raw data in classifier training for deepfake detection and lays a foundation for future improvements in detection robustness and reliability.
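
The pipeline summarised in the abstract (MFCC features fed to an SVM) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the toolchain or hyperparameters, so the 16 kHz sample rate, 512-sample frames, 26 mel filters, 13 coefficients, mean/standard-deviation pooling, and RBF kernel below are all assumptions, and the helper names (`mfcc`, `utterance_features`, `toy_clip`) are hypothetical. The clips are synthetic stand-ins rather than FoR data, so the printed accuracy says nothing about the reported 97.28%.

```python
import numpy as np
from scipy.fft import dct
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

SR = 16000  # assumed sample rate in Hz (not stated in the abstract)


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):   # rising edge of the triangle
            fb[i, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):   # falling edge of the triangle
            fb[i, k] = (hi - k) / max(hi - mid, 1)
    return fb


def mfcc(signal, sr=SR, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """Return an (n_frames, n_mfcc) MFCC matrix for a 1-D signal."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)              # windowed frames
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
    return dct(log_mel, type=2, axis=1, norm="ortho")[:, :n_mfcc]


def utterance_features(signal):
    """Pool frame-level MFCCs into one fixed-length vector per clip."""
    m = mfcc(signal)
    return np.concatenate([m.mean(axis=0), m.std(axis=0)])


def toy_clip(rng, label):
    """One second of synthetic audio (NOT FoR data); class 1 has more harmonics."""
    t = np.arange(SR) / SR
    f0 = rng.uniform(100, 200)
    n_harm = 4 if label == 0 else 12
    tone = sum(np.sin(2 * np.pi * f0 * h * t) / h for h in range(1, n_harm + 1))
    return tone + 0.05 * rng.standard_normal(SR)


rng = np.random.default_rng(0)
labels = np.array([0, 1] * 100)
X = np.stack([utterance_features(toy_clip(rng, y)) for y in labels])

X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels
)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
print(f"held-out accuracy on toy data: {accuracy:.2%}")
```

To try the same sketch on real recordings, `toy_clip` would be replaced by code that loads each utterance as a mono floating-point array at the assumed sample rate; the feature and classifier stages would stay unchanged.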

Published

24.03.2024

How to Cite

Borade, S., Jain, N., Patel, B., Kumar, V., Godhrawala, M., Kolaskar, S., Nagare, Y., Shah, P., & Shah, J. (2024). Improving Deepfake Audio Detection: A Support Vector Machine Approach with Mel-Frequency Cepstral Coefficients. International Journal of Intelligent Systems and Applications in Engineering, 12(18s), 281–291. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/4972

Section

Research Article
