Rolling in the Deep Convolutional Neural Networks

Keywords: Convolutional Neural Networks, Deep Learning, Image Processing


Over the past years, convolutional neural networks (CNNs) have achieved remarkable success in deep learning. The performance of CNN-based models has driven major advances in a wide range of tasks, from computer vision to natural language processing. However, the theoretical calculations behind the convolution operation are rarely spelled out. This study aims to provide a better understanding of the convolution operation by diving into the theory of how the backpropagation algorithm works for CNNs. To explain the training of CNNs clearly, the convolution operation on images is described in detail and backpropagation in CNNs is highlighted. In addition, the Labeled Faces in the Wild (LFW) dataset, which is frequently used in face recognition applications, is used to visualize what CNNs learn. The intermediate activations of a CNN trained on the LFW dataset are visualized to gain insight into how CNNs perceive the world. Thus, the feature maps are interpreted visually as well, alongside the training process.
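The relationship between the forward convolution and its backpropagation can be illustrated with a minimal NumPy sketch (not taken from the paper; the function names `conv2d`, `conv2d_backward_kernel`, and `conv2d_backward_input` are illustrative). The key theoretical point is that both gradients of a convolutional layer are themselves convolutions: the kernel gradient is the valid convolution of the input with the upstream gradient, and the input gradient is the full convolution of the upstream gradient with the 180°-rotated kernel.

```python
import numpy as np

def conv2d(x, k):
    """Valid-mode 2D cross-correlation (the 'convolution' used in CNNs)."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Dot product between the kernel and the current input patch
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def conv2d_backward_kernel(x, dout):
    """dL/dK: valid convolution of the input with the upstream gradient."""
    return conv2d(x, dout)

def conv2d_backward_input(k, dout):
    """dL/dX: full convolution of the upstream gradient with the
    180-degree-rotated kernel, implemented via zero-padding."""
    kh, kw = k.shape
    padded = np.pad(dout, ((kh - 1, kh - 1), (kw - 1, kw - 1)))
    return conv2d(padded, k[::-1, ::-1])
```

For a 4×4 input and a 2×2 kernel, `conv2d` yields a 3×3 feature map; feeding a 3×3 upstream gradient back through the two backward functions recovers gradients with exactly the kernel's 2×2 and the input's 4×4 shapes, as the shape arithmetic of valid and full convolution requires.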




I. Goodfellow, Y. Bengio and A. Courville, Deep Learning. The MIT Press, 2016.

A. Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media, Inc., 2017.

Y. LeCun, Y. Bengio and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436-444, 2015.

Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.

A. Krizhevsky, I. Sutskever and G. Hinton, “ImageNet classification with deep convolutional neural networks,” Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.

G. Huang, Z. Liu, L. van der Maaten and K. Weinberger, “Densely connected convolutional networks,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700-4708, 2017.

K. He, X. Zhang, S. Ren and J. Sun, “Deep residual learning for image recognition,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.

K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.

C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke and A. Rabinovich, “Going deeper with convolutions,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, 2015.

F. Chollet, Deep Learning with Python. Manning Publications Co., 2018.

Z. Qin, F. Yu, C. Liu and X. Chen, “How convolutional neural networks see the world – A survey of convolutional neural network visualization methods,” Mathematical Foundations of Computing, vol. 1, no. 2, 2018.

M.D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,” European Conference on Computer Vision, Springer, Cham, pp. 818-833, 2014.

A. Mahendran and A. Vedaldi, “Visualizing deep convolutional neural networks using natural pre-images,” International Journal of Computer Vision, vol. 120, no. 3, pp. 233-255, 2016.

J. Yosinski, J. Clune, A. Nguyen, T. Fuchs and H. Lipson, “Understanding neural networks through deep visualization,” arXiv preprint arXiv:1506.06579, 2015.

Q. Zhang, Y.N. Wu and S. Zhu, “Interpretable convolutional neural networks,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8827-8836, 2018.

W. Samek, A. Binder, G. Montavon, S. Lapuschkin and K-R. Müller, “Evaluating the visualization of what a deep neural network has learned,” IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 11, pp. 2660-2673, 2016.

A. Nguyen, J. Yosinski and J. Clune, “Understanding neural networks via feature visualization: A survey,” arXiv preprint arXiv:1904.08939, 2019.

K. Fukushima, “Cognitron: A self-organizing multilayered neural network,” Biological Cybernetics, vol. 20, pp. 121-136, 1975.

K. Fukushima, “Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position,” Biological Cybernetics, vol. 36, no. 4, pp. 193-202, 1980.

D. Hubel and T. Wiesel, “Receptive fields of single neurons in the cat’s striate cortex,” The Journal of Physiology, vol. 148, no. 3, pp. 574-591, 1959.

D. Hubel and T. Wiesel, “Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex,” The Journal of Physiology, vol. 160, no. 1, pp. 106-154, 1962.

D. Hubel and T. Wiesel, “Receptive fields and functional architecture of monkey striate cortex,” The Journal of Physiology, vol. 195, no. 1, pp. 215-243, 1968.

N. Nilsson, The Quest for Artificial Intelligence. Cambridge University Press, 2009.

G. Hinton, S. Osindero and Y. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.

C.M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.

D. Soydaner, “Training deep neural network based hyper autoencoders with machine learning methods,” Ph.D. dissertation, Dept. Statistics, Mimar Sinan Fine Arts University, İstanbul, Turkey, 2018.

V. Dumoulin and F. Visin, “A guide to convolution arithmetic for deep learning,” arXiv preprint arXiv:1603.07285, 2016.

D. Rumelhart, G. Hinton and R. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, pp. 533-536, 1986.

P. Solai, “Convolutions and backpropagations,” 2018.

Y. Taigman, M. Yang, M. Ranzato and L. Wolf, “DeepFace: Closing the gap to human-level performance in face verification,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1701-1708, 2014.

C. Garcia and M. Delakis, “Convolutional face finder: A neural architecture for fast and robust face detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 11, pp. 1408-1423, 2004.

G. Huang, M. Ramesh, T. Berg and E. Learned-Miller, “Labeled faces in the wild: A database for studying face recognition in unconstrained environments,” University of Massachusetts, Amherst, Technical Report 07-49, 2007.

D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.

How to Cite
D. Soydaner, “Rolling in the Deep Convolutional Neural Networks”, IJISAE, vol. 7, no. 4, pp. 222-226, Dec. 2019.