Update README.md
[Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4) is a 7B parameter multilingual large language model (LLM) pre-trained with 4T tokens within the research project OpenGPT-X.

Teuken-7B-instruct-research-v0.4 is an instruction-tuned version of [Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4).

### Model Description

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

[Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4) focuses on covering all 24 EU languages; it therefore delivers more stable results across these languages and reflects European values in its answers better than English-centric models, making it well suited to multilingual tasks.

Since the underlying base model is trained on all 24 EU languages, Teuken-7B-instruct-research-v0.4 is likewise intended for research use in these languages. It is a fine-tuned variant with a special focus on German and English.

## Disclaimer: Toxic Content

The model requires the transformers, sentencepiece, and torch libraries.
After installation, here's an example of how to use the model:

As this is a fine-tuned model, it must be used with the provided prompt template; using the model without the template is neither intended nor recommended. The prompt template is defined as follows:

```python
user = "Hi!"
lang_code = "DE"
system_messages = {
    # ... per-language system prompts (elided in this diff excerpt) ...
}
prompt = f"System: {system_messages[lang_code]}\nUser: {user}\nAssistant:<s>"
```
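The template above can be wrapped in a small helper. Note that the two system messages below are illustrative placeholders, not the official Teuken system prompts (the real per-language messages ship with the model card), so treat this as a sketch of the template's shape:

```python
# Sketch of the prompt template as a helper function. The system messages
# here are illustrative placeholders, NOT the official Teuken prompts.
system_messages = {
    "EN": "A chat between a human and an AI assistant.",  # placeholder
    "DE": "Ein Gespraech zwischen einem Menschen und einem KI-Assistenten.",  # placeholder
}

def build_prompt(user: str, lang_code: str = "DE") -> str:
    """Apply the System/User/Assistant template with the trailing <s> token."""
    return f"System: {system_messages[lang_code]}\nUser: {user}\nAssistant:<s>"

prompt = build_prompt("Hi!", "DE")
print(prompt)
```

The same helper works for any language code present in the dictionary, which keeps the template string in exactly one place.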

The prompt template is also directly integrated into the tokenizer and can be used as follows:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
```
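The snippet above is truncated in this diff, so the following end-to-end sketch fills in the usual Hugging Face loading-and-generation flow. The repository id, `trust_remote_code=True`, and passing the language code as `chat_template` are assumptions about how the tokenizer-integrated template is selected, not confirmed by this excerpt:

```python
# Hedged usage sketch: MODEL_NAME, trust_remote_code, and chat_template=lang_code
# are assumptions not confirmed by the truncated snippet above.
MODEL_NAME = "openGPT-X/Teuken-7B-instruct-research-v0.4"

# Chat messages in the role/content format consumed by apply_chat_template.
messages = [{"role": "User", "content": "Hi!"}]

def generate_reply(messages, lang_code="DE", max_new_tokens=100):
    """Load the model and generate one reply via the tokenizer's built-in template."""
    # Imports are local so the sketch can be read without the libraries installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME, trust_remote_code=True, torch_dtype=torch.bfloat16
    )
    input_ids = tokenizer.apply_chat_template(
        messages, chat_template=lang_code, add_generation_prompt=True, return_tensors="pt"
    )
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

# Calling generate_reply(messages) downloads the full 7B checkpoint,
# so it is left to the reader to invoke.
```

Keeping the language code as a parameter mirrors the `lang_code` variable used in the raw-string template earlier in the card.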