Part of International Conference on Learning Representations 2025 (ICLR 2025) Conference
Liangzu Peng, Juan Elenter, Joshua Agterberg, Alejandro Ribeiro, René Vidal
The goal of continual learning (CL) is to train a model that can solve multiple tasks presented sequentially. Recent CL approaches have achieved strong performance by leveraging large pre-trained models that generalize well to downstream tasks. However, such methods lack theoretical guarantees, making them prone to unexpected failures. Conversely, principled CL approaches often fail to achieve competitive performance. In this work, we aim to bridge this gap between theory and practice by designing a simple CL method that is theoretically sound and highly performant. Specifically, we lift pre-trained features into a higher-dimensional space and formulate an over-parametrized minimum-norm least-squares problem. We find that the lifted features are highly ill-conditioned, potentially leading to large training errors (numerical instability) and increased generalization errors. We address these challenges by continually truncating the singular value decomposition (SVD) of the lifted features. Our approach, termed TSVD, is stable with respect to the choice of hyperparameters, can handle hundreds of tasks, and outperforms state-of-the-art CL methods on multiple datasets. Importantly, our method satisfies a recurrence relation throughout its continual learning process, which allows us to prove it maintains small training and generalization errors by appropriately truncating a fraction of SVD factors. This results in a stable continual learning method with strong empirical performance and theoretical guarantees. Code available: https://github.com/liangzu/tsvd.
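For a concrete picture of the ingredients named in the abstract, the following is a minimal sketch of feature lifting followed by a minimum-norm least-squares fit computed from a truncated SVD. It is not the paper's TSVD algorithm or its continual update (see the linked repository for those); the random-ReLU lifting, the `keep` fraction, and all function names are illustrative assumptions.

```python
import numpy as np

def lift_features(X, D=2048, seed=0):
    """Lift pre-trained features X (n, d) to (n, D) with a random ReLU map.
    This particular lifting is an illustrative choice, not the paper's."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], D)) / np.sqrt(X.shape[1])
    return np.maximum(X @ W, 0.0)

def truncated_svd_least_squares(Phi, Y, keep=0.5):
    """Minimum-norm least-squares classifier from a truncated SVD.
    Phi: (n, D) lifted features; Y: (n, c) one-hot labels.
    keep: fraction of SVD factors retained; small singular values are
    discarded to tame the ill-conditioning of the lifted features."""
    U, s, Vt = np.linalg.svd(Phi, full_matrices=False)
    r = max(1, int(keep * len(s)))              # keep the top-r factors
    U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r]
    # min-norm solution restricted to the retained singular subspace
    return Vt_r.T @ ((U_r.T @ Y) / s_r[:, None])

# Toy usage: 200 samples, 64-dim pre-trained features, 10 classes.
X = np.random.randn(200, 64)
Y = np.eye(10)[np.random.randint(0, 10, size=200)]
Phi = lift_features(X)
W_hat = truncated_svd_least_squares(Phi, Y, keep=0.5)
preds = (lift_features(np.random.randn(5, 64)) @ W_hat).argmax(axis=1)
```

In this sketch, truncation plays the role the abstract describes: dropping the smallest singular values keeps the least-squares solve numerically stable when the lifted features are ill-conditioned.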