arXiv:2402.01364

Continual Learning for Large Language Models: A Survey

Published on Feb 2, 2024

Abstract

Large language models (LLMs) are not amenable to frequent re-training, due to high training costs arising from their massive scale. However, updates are necessary to endow LLMs with new skills and keep them up-to-date with rapidly evolving human knowledge. This paper surveys recent works on continual learning for LLMs. Due to the unique nature of LLMs, we catalog continual learning techniques in a novel multi-staged categorization scheme, involving continual pretraining, instruction tuning, and alignment. We contrast continual learning for LLMs with simpler adaptation methods used in smaller models, as well as with other enhancement strategies like retrieval-augmented generation and model editing. Moreover, informed by a discussion of benchmarks and evaluation, we identify several challenges and future work directions for this crucial task.
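The staged view described in the abstract (continual pretraining, then instruction tuning, then alignment) can be illustrated with a small, self-contained sketch. The toy model, synthetic data, stage names, and replay ratio below are assumptions made purely for exposition; this is a generic rehearsal-based continual-update loop, not the survey's own method or any specific technique it catalogs.

```python
# Minimal sketch of a replay-based continual-update loop across sequential stages.
# Toy model and synthetic data stand in for an LLM and real corpora; the stage
# names and REPLAY_RATIO are illustrative assumptions, not values from the paper.
import random
import torch
import torch.nn as nn

torch.manual_seed(0)
random.seed(0)

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def make_stage_data(n=256, dim=16, classes=4):
    """Synthetic stand-in for one stage's training data."""
    x = torch.randn(n, dim)
    y = torch.randint(0, classes, (n,))
    return list(zip(x, y))

# Stages trained in sequence, mirroring the multi-stage categorization.
stages = {
    "continual_pretraining": make_stage_data(),
    "instruction_tuning": make_stage_data(),
    "alignment": make_stage_data(),
}

replay_buffer = []   # small memory of past-stage examples to mitigate forgetting
REPLAY_RATIO = 0.25  # fraction of each batch drawn from the buffer (assumed)

def train_stage(name, data, epochs=2, batch_size=32):
    for _ in range(epochs):
        random.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            # Mix in replayed examples from earlier stages, if any exist.
            k = int(len(batch) * REPLAY_RATIO)
            if replay_buffer and k > 0:
                batch = batch + random.sample(replay_buffer, min(k, len(replay_buffer)))
            x = torch.stack([b[0] for b in batch])
            y = torch.stack([b[1] for b in batch])
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
    # Retain a subset of this stage's data for replay in later stages.
    replay_buffer.extend(random.sample(data, 32))
    print(f"finished stage: {name}")

for name, data in stages.items():
    train_stage(name, data)
```

Rehearsing a small buffer of earlier-stage examples is one common way to mitigate catastrophic forgetting when training proceeds stage by stage; regularization- and architecture-based approaches are alternative strategies in the continual learning literature.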
