---
library_name: transformers
datasets:
- Intel/orca_dpo_pairs
language:
- en
tags:
- mistral-7b
- mistral
- dpo
- neuralhermes
- instruct
- rlhf
- notebook
- endtoend
license: apache-2.0
---

- Base model: `teknium/OpenHermes-2.5-Mistral-7B`
- Refined using Direct Preference Optimization (DPO) on the `Intel/orca_dpo_pairs` dataset (the preference pairs can be inspected as sketched below).
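
For reference, the preference data used for DPO can be inspected directly with the `datasets` library; a minimal sketch (the column names are as I recall them for `Intel/orca_dpo_pairs` and are worth double-checking):

```python
from datasets import load_dataset

# Load the preference-pair dataset used for the DPO fine-tuning
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")

# Each example pairs a prompt with a preferred ("chosen") and a
# dispreferred ("rejected") answer
example = dataset[0]
print(example.keys())  # expected: system, question, chosen, rejected
print(example["question"])
```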

## Uses

### Direct Use

Way 1 (see Way 2 below for faster inference):

```python
import transformers
from transformers import AutoTokenizer

new_model = "abdullahalzubaer/NeuralHermes-2.5-Mistral-7B"

# Format prompt with the model's chat template
message = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"}
]
tokenizer = AutoTokenizer.from_pretrained(new_model)
prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

# Create pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=new_model,
    tokenizer=tokenizer
)

# Generate text
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=200,
)
print(sequences[0]['generated_text'])
```
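
Optionally, if a CUDA GPU with enough memory is available, the same pipeline can be created in half precision, which is usually noticeably faster. This is only a hedged variant of the snippet above (it reuses `new_model` and `tokenizer`, and assumes `accelerate` is installed for `device_map`):

```python
import torch
import transformers

pipeline = transformers.pipeline(
    "text-generation",
    model=new_model,            # same model id as above
    tokenizer=tokenizer,
    torch_dtype=torch.float16,  # load the weights in fp16
    device_map="auto",          # place the model on the available GPU(s)
)
```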

Sample Output from `abdullahalzubaer/NeuralHermes-2.5-Mistral-7B`:

```
<|im_start|>system
You are a helpful assistant chatbot.<|im_end|>
<|im_start|>user
What is a Large Language Model?<|im_end|>
<|im_start|>assistant
A large language model is an artificial intelligence system designed to process and understand large amounts of natural language data.
It's a type of machine learning model, typically built using neural networks,
that is trained on vast datasets of text to learn patterns and relationships within the language.
These models can then generate human-like text, predict the next word in a sequence, perform language translation,
and answer questions, among other tasks. The "large" in the term refers to the size of the model, which includes
the number of parameters, the complexity of the architecture, and the amount of training data it processes.
As a result, large language models are capable of generating more complex and coherent responses compared to smaller models.
```

Sample Output from `mlabonne/NeuralHermes-2.5-Mistral-7B` (as provided in the [tutorial](https://mlabonne.github.io/blog/posts/Fine_tune_Mistral_7b_with_DPO.html)):

```
<|im_start|>system
You are a helpful assistant chatbot.<|im_end|>
<|im_start|>user
What is a Large Language Model?<|im_end|>
<|im_start|>assistant
A large language model is a type of artificial intelligence (AI) system that has been trained on vast amounts of text data.
These models are designed to understand and generate human language, allowing them to perform various natural
language processing tasks, such as text generation, language translation, and question answering. Large language models
typically use deep learning techniques, like recurrent neural networks (RNNs) or transformers, to learn patterns and
relationships in the data, enabling them to generate coherent and contextually relevant responses.
The size of these models, in terms of the number of parameters and the volume of data they are trained on,
plays a significant role in their ability to comprehend and produce complex language structures.
```

So the fine-tuning worked: the answer is perhaps not quite as good as the original `mlabonne/NeuralHermes-2.5-Mistral-7B`, but still close to it (possibly due to the lowered `max_length` in `DPOTrainer`? See the Hyperparameters section below).

Way 2 (I am not sure why, but it is significantly faster than Way 1 above, so I recommend it; taken directly from the
[Mistral model card](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), with the model name replaced by mine):

```python
import torch
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
import trl
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer  # imported in the notebook; not used below

print(torch.__version__)
print(transformers.__version__)
print(trl.__version__)
'''
1.13.0+cu117
4.38.2
0.7.11
'''

model_tokenizer = "abdullahalzubaer/NeuralHermes-2.5-Mistral-7B"  # let's try my model
# model_tokenizer = "mistralai/Mistral-7B-Instruct-v0.2"
# model_tokenizer = "mistralai/Mixtral-8x7B-Instruct-v0.1"

model = AutoModelForCausalLM.from_pretrained(model_tokenizer)
tokenizer = AutoTokenizer.from_pretrained(model_tokenizer)

print(f"Loaded Model = {model.config._name_or_path}")
print(f"Loaded Tokenizer = {tokenizer.name_or_path}")

# Check available GPUs and print their names
gpu_count = torch.cuda.device_count()
print("Available GPUs:", gpu_count)
for i in range(gpu_count):
    print(f"GPU {i}: {torch.cuda.get_device_name(i)}")

# Choose a specific GPU; change device_id to select a different one
device_id = 3
device = f"cuda:{device_id}" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

your_prompt = """What is a Large Language Model?"""

messages = [
    {"role": "user", "content": your_prompt},
]

# Apply the chat template and move inputs and model to the chosen device
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(f"\nComplete I/O:\n{decoded[0]}")
# print(f"Using device: {device}")
# print(f"\nModel Reply:\n{decoded[0].split('[/INST]')[1]}")  # '[/INST]' is the Mistral-Instruct format; see the ChatML helper below

'''
Complete I/O:
<|im_start|> user
What is a Large Language Model? Elaborate.
<|im_end|>
A Large Language Model is a type of artificial intelligence algorithm
designed to generate human-like text or respond to natural language input.
It is typically trained on vast amounts of text data, enabling it to
understand and generate language with a high level of complexity.<|im_end|>
'''
```
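
Note that the commented-out `split('[/INST]')` above comes from the Mistral-Instruct template; this model uses the ChatML template (`<|im_start|>` / `<|im_end|>` tags), as the output shows. A small, hedged sketch for pulling out only the assistant reply (`extract_reply` is my own illustration, not part of any library):

```python
def extract_reply(text: str) -> str:
    """Extract the assistant reply from ChatML-formatted generation output."""
    marker = "<|im_start|>assistant"
    if marker in text:
        # Keep whatever follows the assistant tag (as in the Way 1 output)
        text = text.split(marker, 1)[1]
    else:
        # Fall back to the text after the user's closing tag (as in the output above)
        text = text.split("<|im_end|>", 1)[1]
    # Drop the reply's own closing tag, if present
    return text.split("<|im_end|>", 1)[0].strip()

print(f"\nModel Reply:\n{extract_reply(decoded[0])}")
```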

# Loss

| Step | Training Loss |
|------|---------------|
| 1    | 0.693300      |
| 2    | 0.693200      |
| 3    | 0.692500      |
| 4    | 0.691300      |
| 5    | 0.689400      |
| ...  | ...           |
| 45   | 0.633700      |
| 46   | 0.629000      |
| 47   | 0.591300      |
| 48   | 0.558100      |
| 49   | 0.585800      |
| 50   | 0.558900      |

# Hyperparameters

All hyperparameters are as [here](https://mlabonne.github.io/blog/posts/Fine_tune_Mistral_7b_with_DPO.html), except for the following:

```python
# for TrainingArguments()
dataloader_num_workers=1,      # had to add this #CHANGED_HERE#
dataloader_prefetch_factor=1,

# for DPOTrainer()
# ref_model: not passed (an error was raised when I included a reference model;
# not sure why, needs further investigation)
max_prompt_length=256,  # had to lower this to 256 #CHANGED_HERE# or else CUDA out of memory
max_length=256,         # had to lower this to 256 #CHANGED_HERE# or else CUDA out of memory
```
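
For context, below is a minimal, hedged sketch of how the `DPOTrainer` call looks with these changes applied. `model`, `tokenizer`, `dataset`, and `peft_config` are placeholders for the objects set up in the tutorial, every argument not shown here follows the tutorial, and the keyword arguments target the trl 0.7.x API printed earlier:

```python
from transformers import TrainingArguments
from trl import DPOTrainer

training_args = TrainingArguments(
    output_dir="./results",        # placeholder output directory
    dataloader_num_workers=1,      # #CHANGED_HERE#
    dataloader_prefetch_factor=1,  # #CHANGED_HERE#
    # ... remaining TrainingArguments as in the tutorial ...
)

dpo_trainer = DPOTrainer(
    model,                    # the PEFT-wrapped base model
    ref_model=None,           # no explicit reference model (see note above)
    args=training_args,
    train_dataset=dataset,    # formatted Intel/orca_dpo_pairs
    tokenizer=tokenizer,
    peft_config=peft_config,
    max_prompt_length=256,    # #CHANGED_HERE# lowered to avoid CUDA out of memory
    max_length=256,           # #CHANGED_HERE# lowered to avoid CUDA out of memory
    # ... beta and remaining arguments as in the tutorial ...
)
dpo_trainer.train()
```

As far as I understand trl's PEFT integration, this is also likely why passing an explicit `ref_model` raised an error: when a `peft_config` is supplied, the trainer builds the reference implicitly by running the base model with the adapter disabled, so an explicit reference model is rejected.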

# Reference

Thanks! https://mlabonne.github.io/blog/posts/Fine_tune_Mistral_7b_with_DPO.html