Sapling Dream
Collection
"Sapling Dream" is a series of GPT models with fewer than 1B parameters. They achieve better performance than similarly sized models by reasoning.
2 items
Introducing SaplingDream, a compact GPT model with 0.5 billion parameters, based on the Qwen/Qwen2.5-0.5B-Instruct architecture. This model has been fine-tuned on an RTX 4060 8GB for a bit over two days on ~0.3B tokens...
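As a rough sanity check on those figures, the implied training throughput can be estimated with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes exactly 48 hours of wall-clock time and exactly 0.3B tokens, whereas the description says "a bit over two days" and "~0.3B", so the real rate is likely somewhat lower. The function and constant names are illustrative and not taken from the model card.

```python
def tokens_per_second(total_tokens: float, hours: float) -> float:
    """Average throughput given a token count and wall-clock hours."""
    return total_tokens / (hours * 3600.0)


# Assumed values: the source gives only "~0.3B tokens" and "a bit over two days".
TOTAL_TOKENS = 0.3e9
TRAINING_HOURS = 48.0

rate = tokens_per_second(TOTAL_TOKENS, TRAINING_HOURS)
print(f"{rate:.0f} tokens/s")  # prints "1736 tokens/s"
```

Under these assumptions the fine-tune averaged roughly 1,700 tokens per second on the single GPU.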
Evaluation Loss Chart
Base model: Qwen/Qwen2.5-0.5B