---
license: cc-by-sa-4.0
language:
- ko
- en
tags:
- moe
---

# **Synatra-Mixtral-8x7B**

<img src="./Synatra-Mixtral.png" alt="Synatra-Mixtral-8x7B" width="512"/>

**Synatra-Mixtral-8x7B** is a version of the Mixtral-8x7B-Instruct-v0.1 model fine-tuned on **Korean** datasets.

This model is tuned for strong comprehension and inference capabilities and is licensed under CC-BY-SA 4.0.

# **EXL2 Info**

[measurement.json](./measurement.json)

8.0bpw, 6.0bpw, 4.0bpw, 3.5bpw, 3.0bpw, 2.6bpw, 2.3bpw

<img src="./measurement.png" alt="Measurement" width="512"/>
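
As a rough guide to what the bits-per-weight (bpw) levels listed above mean for size, the sketch below estimates the weight footprint at each level. It assumes about 46.7 billion total parameters, the figure Mistral AI reports for Mixtral-8x7B, and counts weights only: KV cache, activations, and file-format overhead are not included, so the results are lower bounds rather than measured sizes. `PARAM_COUNT` and `estimate_weight_gib` are illustrative names, not part of any library.

```python
# Rough weight-size estimate per EXL2 bits-per-weight (bpw) level.
# Assumption: ~46.7e9 total parameters (the figure reported for Mixtral-8x7B).
PARAM_COUNT = 46.7e9
BPW_LEVELS = [8.0, 6.0, 4.0, 3.5, 3.0, 2.6, 2.3]


def estimate_weight_gib(bpw: float, params: float = PARAM_COUNT) -> float:
    """Return the approximate size of the weights in GiB at the given bpw."""
    total_bits = params * bpw
    return total_bits / 8 / 1024**3


if __name__ == "__main__":
    for bpw in BPW_LEVELS:
        print(f"{bpw:.1f} bpw: ~{estimate_weight_gib(bpw):.1f} GiB")
```

Comparing these estimates with the available GPU memory gives a first idea of which quantization level might fit, before accounting for context length.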

# **License**

The "Model" (i.e. the base model, derivatives, and merges/mixes) is completely free to use for non-commercial purposes, as long as the included **cc-by-sa-4.0** license in any parent repository and the non-commercial use statute remain in place, regardless of other models' licenses.

# **Model Details**

**Base Model**

[mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)

**Trained On**

A100 80GB * 6

**Instruction format**

It follows the **Alpaca** format.
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{input}

### Response:
{output}
```
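
To make the template concrete, here is a minimal sketch that fills it in by hand and pulls the answer back out of generated text. The helper names `build_alpaca_prompt` and `extract_response` are hypothetical, and the exact whitespace (a blank line between blocks, a newline after each header) is an assumption based on the template as shown; the tokenizer's own chat template, if it defines one, may differ.

```python
ALPACA_HEADER = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def build_alpaca_prompt(instruction: str) -> str:
    """Format an instruction so the model continues after '### Response:'."""
    return (
        f"{ALPACA_HEADER}\n\n"
        f"### Instruction:\n{instruction.strip()}\n\n"
        "### Response:\n"
    )


def extract_response(generated_text: str) -> str:
    """Return the text after the last '### Response:' marker.

    If the marker is absent, the whole text is returned (stripped).
    """
    marker = "### Response:"
    _, found, tail = generated_text.rpartition(marker)
    return tail.strip() if found else generated_text.strip()


if __name__ == "__main__":
    prompt = build_alpaca_prompt("Summarize the theory of relativity in one sentence.")
    print(prompt)
```

The prompt ends right after `### Response:` so that whatever the model generates next is the `{output}` part of the template.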

# **Model Benchmark**

TBD

# **Implementation Code**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" # the device to load the model onto

# Download (or load from cache) the fine-tuned weights and the matching tokenizer
model = AutoModelForCausalLM.from_pretrained("maywell/Synatra-Mixtral-8x7B")
tokenizer = AutoTokenizer.from_pretrained("maywell/Synatra-Mixtral-8x7B")

messages = [
    # English: "Explain Einstein's theory of relativity in detail."
    {"role": "user", "content": "아인슈타인의 상대성이론에 대해서 자세히 설명해줘."},
]

# Render the conversation with the tokenizer's chat template and tokenize it
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

# Sample up to 1000 new tokens, then decode the full sequence (prompt included)
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
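
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so `decoded[0]` in the snippet above also contains the input prompt. Below is a minimal sketch of one way to keep only the continuation. It uses nested lists of token ids to stay dependency-free, and `strip_prompt_tokens` is a hypothetical helper, not a transformers function.

```python
from typing import List, Sequence


def strip_prompt_tokens(
    generated: Sequence[Sequence[int]], prompt_length: int
) -> List[List[int]]:
    """Drop the first `prompt_length` token ids from every generated sequence."""
    if prompt_length < 0:
        raise ValueError("prompt_length must be non-negative")
    return [list(seq[prompt_length:]) for seq in generated]


if __name__ == "__main__":
    # Toy ids standing in for model output: 3 prompt tokens, then 2 new tokens.
    prompt_ids = [1, 733, 16289]
    generated_ids = [prompt_ids + [9000, 2]]
    new_tokens = strip_prompt_tokens(generated_ids, len(prompt_ids))
    print(new_tokens)
```

With the tensors from the snippet above, the equivalent slice would be `generated_ids[:, model_inputs.shape[1]:]`, applied before calling `tokenizer.batch_decode`.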

# **Author's Message**

This model's training was not sponsored by anyone; it was supported by people around the Earth.

[Support Me](https://www.buymeacoffee.com/mwell)

Follow me on Twitter: https://twitter.com/stablefluffy