Analyzing the Effect of Different Activation Functions in Deep Learning on Accuracy and Execution Time
Keywords:
activation functions, neural network, deep learning, sigmoid, types of ReLU, softsign, tanh

Abstract
Activation functions are critical in determining which nodes are active within a neural network. Choosing the most suitable activation function is crucial because of its impact on the overall output of the network. Before selecting an activation function, it is essential to examine the characteristics of each candidate against our specific needs; the monotonicity, derivative, and range of an activation function are important characteristics. In our review study, we examined 13 different activation functions, including ReLU, Linear, Exponential Linear Unit, Gaussian Error Linear Unit, Sigmoid, and SoftPlus, among others.
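As a rough illustration of the kind of comparison described above, the following NumPy sketch (our own, not the paper's code) defines a few of the surveyed activations and reports their output range and a simple wall-clock timing on a large input. The function names, input grid, and timing setup are our assumptions for illustration only.

```python
import time

import numpy as np

# A few of the activation functions discussed in the abstract,
# written as plain NumPy functions.
def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def softplus(x):
    return np.log1p(np.exp(x))

def softsign(x):
    return x / (1.0 + np.abs(x))

def time_activation(fn, x, repeats=100):
    """Rough wall-clock time of `repeats` evaluations of one activation."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(x)
    return time.perf_counter() - start

if __name__ == "__main__":
    x = np.linspace(-5.0, 5.0, 100_000)
    for fn in (relu, sigmoid, tanh, softplus, softsign):
        y = fn(x)
        print(f"{fn.__name__:>8}: range [{y.min():+.3f}, {y.max():+.3f}], "
              f"{time_activation(fn, x):.4f} s")
```

Printing the range makes the characteristic differences visible at a glance (e.g. sigmoid is bounded in (0, 1) while ReLU is unbounded above), while the timing loop gives a first-order sense of the execution-time differences the title refers to.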
References
J. A. Hertz, Introduction to the Theory of Neural Computation. CRC Press, 2018.
L. Deng, “A tutorial survey of architectures, algorithms, and applications for deep learning,” APSIPA Transactions on Signal and Information Processing, vol. 3, p. e2, 2014.
K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” Computer Vision and Pattern Recognition (CVPR), vol. 7, Dec. 2015.
Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, “Backpropagation applied to handwritten zip code recognition,” Neural Computation, vol. 1, no. 4, pp. 541–551, Dec. 1989.
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1–17.
K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” CoRR, vol. abs/1409.1556, 2014.
A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proceedings of the 25th International Conference on Neural Information Processing Systems, vol. 1, 2012, pp. 1097–1105.
G. Huang, Y. Sun, Z. Liu, D. Sedra, and K. Q. Weinberger, “Deep networks with stochastic depth,” in ECCV (4), 2016, vol. 9908, pp. 646–661.
M. Z. Alom, T. M. Taha, C. Yakopcic, and V. K. Asari, “The history began from AlexNet: A comprehensive survey on deep learning approaches,” arXiv, Dec. 2018.
K. J. Piczak, “Recognizing bird species in audio recordings using deep convolutional neural networks,” in CLEF (Working Notes), 2016, pp. 534–543.
M. A. Nielsen, Neural Networks and Deep Learning. Determination Press, 2015.
H. Robbins and S. Monro, “A stochastic approximation method,” Ann. Math. Statist., vol. 22, no. 3, pp. 400–407, Sep. 1951.
A. Banerjee, A. Dubey, A. Menon, S. Nanda, and G. C. Nandi, “Speaker recognition using deep belief networks,” arXiv preprint arXiv:1805.08865, 2018.
R. H. Byrd, S. L. Hansen, J. Nocedal, and Y. Singer, “A stochastic quasi-Newton method for large-scale optimization,” SIAM Journal on Optimization, vol. 26, no. 2, pp. 1008–1031, 2016.
Y. LeCun, L. Bottou, G. B. Orr, and K. R. Müller, “Efficient backprop,” in Neural Networks: Tricks of the Trade, Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 9–50.
R. Hecht-Nielsen, “Theory of the backpropagation neural network,” in Neural Networks for Perception, H. Wechsler, Ed. Academic Press, 1992, pp. 65–93.
K. Hara, H. Kataoka, and Y. Satoh, “Learning spatio-temporal features with 3D residual networks for action recognition,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 3154–3160.
K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1026–1034.
R. M. Neal, “Connectionist learning of belief networks,” Artif. Intell., vol. 56, no. 1, pp. 71–113, Jul. 1992.
L. B. Godfrey and M. S. Gashler, “A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks,” in 7th International Conference on Knowledge Discovery and Information Retrieval, 2015, pp. 481–486.
W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi, “A survey of deep neural network architectures and their applications,” Neurocomputing, vol. 234, pp. 11–26, 2017.
A. Karpathy, “Yes you should understand backprop.” https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b, 2016.
Z. Ding and H. Dong, “Chapter 13: Learning to run,” Springer Science and Business Media LLC, 2020.
T. Szandała, “Review and comparison of commonly used activation functions for deep neural networks,” in Bio-inspired Neurocomputing, Springer, Singapore, 2021, pp. 203–224.
License
![Creative Commons License](http://i.creativecommons.org/l/by-sa/4.0/88x31.png)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets others share and adapt the material provided they give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.