Models and evaluation data for our 2025 COLING paper (https://aclanthology.org/2025.coling-main.404/).
Bastian Bunzeck
bbunzeck
AI & ML interests
Cognitive and usage-based approaches to language modeling, language acquisition in humans and machines, small and efficient language models
Recent Activity
Updated a collection 21 days ago: Small Language Models Also Work With Small Vocabularies
Collections: 3
Papers: 1
Models: 14
bbunzeck/gpt-wee-large-curriculum • Text Generation • 116
bbunzeck/gpt-wee-large • Text Generation • 177
bbunzeck/gpt-wee-small-curriculum • Text Generation • 115
bbunzeck/gpt-wee-small • Text Generation • 115
bbunzeck/gpt-wee-medium • Text Generation • 167
bbunzeck/tweenie_llama • Text Generation • 123
bbunzeck/weenie_llama • Text Generation • 166
bbunzeck/teenie_llama • Text Generation • 125
bbunzeck/baby_llama • Text Generation • 163
bbunzeck/phoneme-llama-no-whitespace • Text Generation • 1.96k
Datasets: 8
bbunzeck/rhyme-sentences • Viewer • 400 • 16
bbunzeck/wug-words • Viewer • 1k • 14
bbunzeck/phoneme-babylm-100M • Viewer • 15.8M • 43
bbunzeck/phoneme-blimp • Viewer • 59.9k • 24
bbunzeck/phoneme-babylm-10M • Viewer • 3.92M • 25
bbunzeck/minisiqa • Viewer • 1.39k • 40
bbunzeck/slayqa • Viewer • 1.39k • 40
bbunzeck/noslayqa • Viewer • 1.39k • 37