Acquiring Bidirectionality via Large and Small Language Models

This is the language model for obtaining token-level representations proposed in our COLING 2025 paper "Acquiring Bidirectionality via Large and Small Language Models." Using token representations from bidirectional language models (LMs) such as BERT is still a widely used approach for token-classification tasks. Even though much larger unidirectional LMs such as Llama-2 exist, they are rarely used to replace the token representations of bidirectional LMs. We propose to train a new small backward LM and concatenate its representations with those of an existing LM for downstream tasks.

This model is the "small backward LM"; it needs to be combined with a forward LM such as GPT-2. Please refer to our official repository for instructions on using this model. This particular model was trained with the GPT-2 vocabulary.
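As a rough illustration of how the two sets of representations might be combined, here is a minimal sketch. It assumes the backward LM can be loaded as a standard `GPT2Model` checkpoint, shares the GPT-2 tokenizer, and simply sees the token sequence in reverse order; the backward checkpoint path is a placeholder, and the official repository describes the actual loading and preprocessing steps.

```python
# Minimal sketch (not the official procedure): concatenating forward GPT-2
# representations with representations from this backward LM.
# Assumptions: the backward LM loads as a GPT2Model checkpoint, shares the
# GPT-2 tokenizer, and expects its input tokens in reversed order.
# "backward-lm-checkpoint" is a placeholder path.
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
forward_lm = GPT2Model.from_pretrained("gpt2")
backward_lm = GPT2Model.from_pretrained("backward-lm-checkpoint")  # placeholder

text = "Acquiring bidirectionality via large and small language models"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Forward LM: left-to-right token representations.
    fwd_hidden = forward_lm(**inputs).last_hidden_state  # (1, seq_len, d_fwd)

    # Backward LM: feed the reversed sequence, then flip the outputs back
    # so positions align with the original token order.
    rev_ids = torch.flip(inputs["input_ids"], dims=[1])
    bwd_hidden = backward_lm(input_ids=rev_ids).last_hidden_state
    bwd_hidden = torch.flip(bwd_hidden, dims=[1])  # (1, seq_len, d_bwd)

# Concatenated per-token representations for a downstream token classifier.
token_repr = torch.cat([fwd_hidden, bwd_hidden], dim=-1)  # (1, seq_len, d_fwd + d_bwd)
print(token_repr.shape)
```

The concatenated vectors can then be fed to a token-classification head (e.g. for NER or chunking) in place of, or alongside, representations from a bidirectional LM.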

Model size: 124M parameters (Safetensors, F32 tensors)