Model Card for TrelisSmolLM-instruct

This model is a fine-tuned version of TrelisSmolLM-base, optimized for instruction following and conversational tasks using the WebInstructSub dataset.

To purchase the training scripts used for this model, visit: https://trelis.com/advanced-fine-tuning-scripts/

Model Details

Model Description

TrelisLM-80M-SFT is an 80 million parameter language model derived from SmolLM-360M through pruning and distillation, and then fine-tuned on the WebInstructSub dataset for improved instruction following capabilities.

Developed by: Trelis AI
Model type: Causal Language Model
Language(s): English
License: [More Information Needed]
Finetuned from model: Trelis/80M-0.0090-cosmopedia

Model Sources

Repository: https://huggingface.co./Trelis/80M-2percent-corpus-SFT

Uses

Direct Use

This model is designed for instruction following and conversational tasks. It can be used for:

Generating responses to user prompts or questions
Engaging in task-oriented dialogues
Assisting with general language understanding and generation tasks

Out-of-Scope Use

This model should not be used for:

Production systems without thorough testing and evaluation
Tasks requiring domain-specific expertise without additional fine-tuning
Any applications where errors could lead to harmful consequences

Training Details

Training Data

The model was fine-tuned on the TIGER-Lab/WebInstructSub dataset, which consists of instruction-response pairs. The training process used:

50,000 initial rows for the main training phase
10,000 additional rows for an annealing phase
10,000 randomly selected rows for evaluation

Training Procedure

Preprocessing: The dataset was formatted into a conversational structure with user and assistant messages.
Training type: Supervised Fine-Tuning (SFT)
Training regime: BFloat16 mixed precision

Training Hyperparameters

Batch size: 8
Gradient Accumulation steps: 4
Learning rate: 1e-3
Number of epochs: 1
Max sequence length: 2048
Warmup steps: 20

The training used a custom learning rate scheduler with an initial constant phase followed by cosine annealing.

Software and Hardware

Software: Transformers, TRL (Transformer Reinforcement Learning), Accelerate
Hardware: [More Information Needed]

Evaluation

Evaluation was performed on a randomly selected subset of 10,000 rows from the WebInstructSub dataset.

Metrics

[More Information Needed]

Limitations and Bias

As this model is fine-tuned on the WebInstructSub dataset, it may inherit biases present in that dataset. Additionally, as a smaller language model, it may have limitations in handling complex or highly specialized tasks compared to larger models.

Recommendations

Thoroughly test the model's outputs before using it in any sensitive applications.
Be aware that the model's knowledge is limited to its training data and it may produce incorrect or biased information.
For critical applications, consider using this model in conjunction with other sources of information or larger, more comprehensive models.

How to Get Started with the Model

You can use this model with the Transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Trelis/80M-2percent-corpus-SFT")
tokenizer = AutoTokenizer.from_pretrained("Trelis/80M-2percent-corpus-SFT")

# Example usage
input_text = "What is the capital of France?"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids, max_length=50)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)

Trelis
/

TrelisSmolLM-instruct