Efficient and Effective Architecture in Continual Learning Through Various ResNets
Keywords:
Catastrophic forgetting, Continual learning, ESPN, Machine learning, ResNet

Abstract
Continual learning is a crucial capability for advancing artificial intelligence, yet it faces a significant challenge known as catastrophic forgetting: models lose previously acquired knowledge when learning new tasks. While existing methods offer partial remedies, the impact of a model's architecture on forgetting remains largely unexplored. This study examines Residual Networks (ResNets) to evaluate how modifications in depth, width, and connectivity influence continual learning. It introduces a simplified design tailored specifically for continual learning and compares its efficiency against established ResNets. Through an in-depth exploration of the algorithm's configuration, the study explains the rationale behind its design decisions and evaluates the proposed model on a diverse set of metrics to identify both strengths and areas for improvement. Ultimately, this research sheds light on how architectural choices affect a model's ability to learn over time, with the goal of fostering AI systems that learn continuously without the detrimental effects of forgetting. Achieving accuracy levels ranging from 62.52% to 90.39% across various tasks, the proposed model demonstrates its effectiveness in real-world continual learning scenarios.
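The architectural knobs the abstract refers to (depth, width, and skip connectivity) can be illustrated with a minimal, framework-free sketch. The block structure, dimensions, and weight initialization below are illustrative assumptions for exposition, not the paper's actual model:

```python
import numpy as np

def residual_block(x, w1, w2, use_skip=True):
    """One residual block: y = x + W2 @ relu(W1 @ x).
    Setting use_skip=False removes the identity shortcut,
    i.e. changes the block's connectivity."""
    h = np.maximum(w1 @ x, 0.0)      # hidden layer + ReLU; rows of w1 set the width
    out = w2 @ h
    return out + x if use_skip else out

def tiny_resnet(x, weights, use_skip=True):
    """Stack of residual blocks; depth = number of (w1, w2) pairs."""
    for w1, w2 in weights:
        x = residual_block(x, w1, w2, use_skip)
    return x

rng = np.random.default_rng(0)
dim, width, depth = 8, 16, 3          # architectural knobs under study
weights = [(rng.standard_normal((width, dim)) * 0.1,
            rng.standard_normal((dim, width)) * 0.1) for _ in range(depth)]
x = rng.standard_normal(dim)
y = tiny_resnet(x, weights)
```

One property the skip connection provides: when a block's residual branch contributes nothing (zero weights), the block reduces to the identity, so added depth cannot destroy an existing representation — a behavior often cited as helpful when learning tasks sequentially.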
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.