Efficient and Effective Architecture in Continual Learning Through Various ResNets

Authors

  • Sachin Gaur, Rahul Pandey

Keywords

Catastrophic forgetting, Continual learning, ESPN, Machine learning, ResNet

Abstract

Continual learning is a crucial capability for advancing artificial intelligence, yet it faces a significant obstacle known as catastrophic forgetting: models lose previously acquired knowledge when they learn new tasks. Existing methods offer only partial remedies, and the effect of a model's architecture on forgetting remains largely unexplored. This study examines Residual Networks (ResNets) to evaluate how changes in depth, width, and connectivity influence continual learning, and it introduces a simplified design tailored specifically to this setting, comparing its efficiency against established ResNets. Through an in-depth exploration of the algorithm's configuration, the study explains the rationale behind its design decisions and evaluates the proposed model on a diverse set of metrics to identify both strengths and areas for improvement. The findings shed light on how architectural choices affect a model's ability to learn over time, with the goal of fostering AI systems that learn continuously without the detrimental effects of forgetting. Achieving accuracies between 62.52% and 90.39% across tasks, the proposed model demonstrates its effectiveness in realistic continual learning scenarios.
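
The full architecture is not reproduced on this page. Purely as a hypothetical illustration of the three axes the abstract varies (depth, width, and connectivity of a ResNet), the PyTorch sketch below builds a small residual network in which each axis is an explicit parameter; the class names, channel counts, and defaults are assumptions for illustration, not the authors' actual design.

import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    # One residual block; use_skip=False turns it into a plain conv block.
    def __init__(self, in_ch, out_ch, stride=1, use_skip=True):
        super().__init__()
        self.use_skip = use_skip
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 projection so the shortcut matches shape when stride or width changes.
        self.proj = None
        if use_skip and (stride != 1 or in_ch != out_ch):
            self.proj = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch))

    def forward(self, x):
        out = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        if self.use_skip:
            out = out + (self.proj(x) if self.proj is not None else x)
        return self.relu(out)

class ConfigurableResNet(nn.Module):
    # depth: blocks per stage; width: channel multiplier; use_skip: residual connectivity.
    def __init__(self, depth=2, width=1, use_skip=True, num_classes=10):
        super().__init__()
        chans = [16 * width, 32 * width, 64 * width]
        self.stem = nn.Sequential(
            nn.Conv2d(3, chans[0], 3, padding=1, bias=False),
            nn.BatchNorm2d(chans[0]), nn.ReLU(inplace=True))
        blocks, in_ch = [], chans[0]
        for i, ch in enumerate(chans):
            for j in range(depth):
                stride = 2 if (i > 0 and j == 0) else 1  # downsample at stage entry
                blocks.append(BasicBlock(in_ch, ch, stride, use_skip))
                in_ch = ch
        self.blocks = nn.Sequential(*blocks)
        self.head = nn.Linear(in_ch, num_classes)

    def forward(self, x):
        x = self.blocks(self.stem(x))
        x = torch.flatten(nn.functional.adaptive_avg_pool2d(x, 1), 1)
        return self.head(x)

# Example: a shallow, double-width variant with residual connections disabled,
# i.e. a plain-CNN baseline at a comparable parameter budget.
model = ConfigurableResNet(depth=1, width=2, use_skip=False)
print(model(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 10])

Sweeping depth, width, and use_skip over a task sequence would be one straightforward way to run the kind of architecture ablation the abstract describes.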

Published

16.03.2024

How to Cite

Gaur, S., & Pandey, R. (2024). Efficient and Effective Architecture in Continual Learning Through Various ResNets. International Journal of Intelligent Systems and Applications in Engineering, 12(3), 1148–1161. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5394

Issue

Vol. 12 No. 3 (2024)

Section

Research Article