This is a SmolLM2-360M-Instruct model fine-tuned on the Faroese portion of Fineweb-2. It is intended for my research and has not been evaluated more broadly yet.
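The card does not include a usage snippet, so the following is only a minimal sketch of how the checkpoint could be loaded for plain text continuation with the `transformers` library. The helper name `generate`, the greedy decoding setting, and the `max_new_tokens` default are illustrative assumptions, not settings taken from this card. Whether the base model's chat template still behaves well after this training has not been evaluated, so the sketch uses raw text prompts.

```python
# Repository ID as shown on this model page.
MODEL_ID = "jekunz/smollm-360m-cpt-fineweb-faroese"


def generate(prompt, max_new_tokens=50):
    """Continue `prompt` with the model (hypothetical helper, greedy decoding)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the raw text prompt; no chat template is applied (an assumption).
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding, chosen only for reproducibility
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Føroyar eru")` would download the weights on first use and return the prompt followed by the model's continuation.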

Training:

Epochs: 5
Learning rate: 5e-4
LR scheduler: Cosine
Warmup ratio: 0.05
Per-device batch size: 1
GPUs: 4 A100 (80GB)
Gradient accumulation steps: 32
Effective batch size: 128 (1 per device × 4 GPUs × 32 accumulation steps)
Max. context length: 8192 tokens
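To make the numbers above concrete, here is a small sketch that recomputes the effective batch size and evaluates a cosine learning-rate schedule with warmup. It assumes linear warmup followed by cosine decay to zero, which is the common form; the card only says "Cosine", so the exact shape is an assumption. The function names and the `total_steps` value in the usage line are hypothetical, since the card does not report the number of optimizer steps.

```python
import math

# Values taken from the training list above.
PER_DEVICE_BATCH_SIZE = 1
NUM_GPUS = 4
GRAD_ACCUM_STEPS = 32
PEAK_LR = 5e-4
WARMUP_RATIO = 0.05
MAX_CONTEXT_LENGTH = 8192


def effective_batch_size(per_device, num_gpus, grad_accum):
    """Sequences contributing to one optimizer step."""
    return per_device * num_gpus * grad_accum


def lr_at_step(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`, assuming linear warmup then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from the peak learning rate down to 0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


batch = effective_batch_size(PER_DEVICE_BATCH_SIZE, NUM_GPUS, GRAD_ACCUM_STEPS)
print(batch)  # 128

# Upper bound on tokens per optimizer step, reached only if every sequence is full length.
print(batch * MAX_CONTEXT_LENGTH)  # 1048576
```

For example, with a hypothetical `total_steps=1000`, `lr_at_step(50, 1000)` is the first step after warmup and returns the peak rate of 5e-4, after which the rate decays toward zero.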

Model size: 362M params (Safetensors, tensor type F32)

Base model: HuggingFaceTB/SmolLM2-360M-Instruct

Collection: SmolLM CPT. Continued Pre-Training of SmolLM models on the Fineweb-2 portions of Scandinavian languages.