Analyzing the Effect of Different Activation Functions in Deep Learning on Accuracy and Execution Time

Authors

  • Mahesh D. Titiya, Arjun V. Bala, Sheshang Degadwala

Keywords

activation functions, neural network, deep learning, sigmoid, types of ReLU, softsign, tanh

Abstract

Activation functions are critical in determining which nodes are active within a neural network. Choosing the most suitable activation function is crucial because of its impact on the overall output of the network. Prior to choosing an activation function, it is essential to examine the characteristics of each candidate against our specific needs; monotonicity, derivatives, and range are among the most important of these characteristics. In our review study, we examined 13 different activation functions, including ReLU, Linear, Exponential Linear Unit, Gaussian Error Linear Unit, Sigmoid, and SoftPlus, among others.
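
For illustration only (this sketch is not taken from the paper), the following minimal NumPy code defines several of the activation functions named above; the comments note the range and monotonicity properties the abstract refers to, and the GELU uses the common tanh-based approximation as an assumption.

import numpy as np

# Minimal NumPy definitions of several activation functions discussed in the review.
def relu(x):
    return np.maximum(0.0, x)                              # range [0, inf), monotonic

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))                        # range (0, 1), monotonic

def tanh(x):
    return np.tanh(x)                                      # range (-1, 1), monotonic

def softsign(x):
    return x / (1.0 + np.abs(x))                           # range (-1, 1), monotonic

def softplus(x):
    return np.log1p(np.exp(x))                             # smooth ReLU, range (0, inf)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))   # range (-alpha, inf)

def gelu(x):
    # tanh approximation of the Gaussian Error Linear Unit (non-monotonic near 0)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

if __name__ == "__main__":
    x = np.linspace(-3.0, 3.0, 7)
    for fn in (relu, sigmoid, tanh, softsign, softplus, elu, gelu):
        print(f"{fn.__name__:9s}", np.round(fn(x), 3))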




Published

20.06.2024

How to Cite

Mahesh D. Titiya. (2024). Analyzing the Effect of Different Activation Functions in Deep Learning on Accuracy and Execution Time. International Journal of Intelligent Systems and Applications in Engineering, 12(4), 742–. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/6280

Issue

Section

Research Article