Update README.md
[Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4) is a 7B parameter multilingual large language model (LLM) pre-trained with 4T tokens within the research project OpenGPT-X.

Teuken-7B-instruct-research-v0.4 is an instruction-tuned version of [Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4).

### Model Description

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

[Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4) focuses on covering all 24 EU languages; it therefore delivers more stable results across these languages and reflects European values in its answers better than English-centric models, making it well suited to multilingual tasks.

Since the underlying base model is trained on all 24 EU languages, Teuken-7B-instruct-research-v0.4 is likewise intended for research use in these languages. It is a fine-tuned variant with a special focus on German and English.

## Disclaimer: Toxic Content

The model requires the transformers, sentencepiece, and torch libraries.
After installation, here's an example of how to use the model:

As this is a fine-tuned model, it must be used with the provided prompt template; using the model without the template is neither intended nor recommended. The prompt template is defined as follows:

```python
user = "Hi!"
lang_code = "DE"
system_messages = {
    # ... per-language system prompts (elided in this diff excerpt) ...
}
prompt = f"System: {system_messages[lang_code]}\nUser: {user}\nAssistant:<s>"
```
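The template above can be wrapped in a small helper. Note that the two system messages below are illustrative placeholders, not the official Teuken system prompts (the real per-language messages ship with the model card), so treat this as a sketch of the template's shape:

```python
# Sketch of the prompt template as a helper function. The system messages
# here are illustrative placeholders, NOT the official Teuken prompts.
system_messages = {
    "EN": "A chat between a human and an AI assistant.",  # placeholder
    "DE": "Ein Gespraech zwischen einem Menschen und einem KI-Assistenten.",  # placeholder
}

def build_prompt(user: str, lang_code: str = "DE") -> str:
    """Apply the System/User/Assistant template with the trailing <s> token."""
    return f"System: {system_messages[lang_code]}\nUser: {user}\nAssistant:<s>"

prompt = build_prompt("Hi!", "DE")
print(prompt)
```

The same helper works for any language code present in the dictionary, which keeps the template string in exactly one place.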

The prompt template is also directly integrated into the tokenizer and can be used as follows:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
```
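The snippet above is truncated in this diff, so the following end-to-end sketch fills in the usual Hugging Face loading-and-generation flow. The repository id, `trust_remote_code=True`, and passing the language code as `chat_template` are assumptions about how the tokenizer-integrated template is selected, not confirmed by this excerpt:

```python
# Hedged usage sketch: MODEL_NAME, trust_remote_code, and chat_template=lang_code
# are assumptions not confirmed by the truncated snippet above.
MODEL_NAME = "openGPT-X/Teuken-7B-instruct-research-v0.4"

# Chat messages in the role/content format consumed by apply_chat_template.
messages = [{"role": "User", "content": "Hi!"}]

def generate_reply(messages, lang_code="DE", max_new_tokens=100):
    """Load the model and generate one reply via the tokenizer's built-in template."""
    # Imports are local so the sketch can be read without the libraries installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME, trust_remote_code=True, torch_dtype=torch.bfloat16
    )
    input_ids = tokenizer.apply_chat_template(
        messages, chat_template=lang_code, add_generation_prompt=True, return_tensors="pt"
    )
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

# Calling generate_reply(messages) downloads the full 7B checkpoint,
# so it is left to the reader to invoke.
```

Keeping the language code as a parameter mirrors the `lang_code` variable used in the raw-string template earlier in the card.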