A Comprehensive Review of Deepfake Detection Pertaining to Images, Videos, Audio, and News using Deep Learning Techniques
Deepfakes, realistic synthetic media generated using artificial intelligence (AI), pose a significant threat to individuals and society. The rapid advancement of deepfake technology has led to the creation of highly realistic synthetic content spanning images, videos, audio, and news. While deepfake applications offer creative possibilities, their misuse for misinformation, identity fraud, and cybersecurity attacks necessitates robust detection methods. Deepfake crimes are on the rise, and detecting deepfake media has become a major challenge that is in high demand in digital forensics. This review explores state-of-the-art deep learning (DL) techniques for deepfake detection across four media types: images, videos, audio, and news. Traditional machine learning (ML) approaches rely on handcrafted features and struggle to keep pace with evolving deepfake methods. In contrast, DL techniques such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks have demonstrated superior detection accuracy by learning discriminative features. Recurrent Neural Networks (RNNs) and Transformer-based architectures such as Bidirectional Encoder Representations from Transformers (BERT) have likewise demonstrated high accuracy in identifying manipulated content. Furthermore, recent advancements such as Vision Transformers (ViTs) and Explainable AI (XAI) models are enhancing detection interpretability and robustness. This review also highlights future research directions for strengthening deepfake detection mechanisms, since the rapid advancement of deepfake generation necessitates continuous research and development of countermeasures.