Commit c870e86 (verified) · 1 parent: 39302de
danielsteinigen committed: Update README.md
Files changed (1): README.md (+5, −3)
````diff
@@ -36,7 +36,7 @@ base_model:
 
 
 [Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4) is a 7B parameter multilingual large language model (LLM) pre-trained with 4T tokens within the research project OpenGPT-X.
-Teuken-7B-instruct-v0.4 is an instruction-tuned version of [Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4).
+Teuken-7B-instruct-research-v0.4 is an instruction-tuned version of [Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4).
 
 
 ### Model Description
@@ -52,7 +52,8 @@ Teuken-7B-instruct-v0.4 is an instruction-tuned version of [Teuken-7B-base-v0.4]
 ## Uses
 
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-Teuken-7B-instruct-research-v0.4 is intended for research use in all official 24 European languages. Since Teuken-7B-instruct-research-v0.4 focuses on covering all 24 EU languages, it renders more stable results across these languages and better reflects European values in its answers than English-centric models. It is therefore specialized for use in multilingual tasks.
+[Teuken-7B-base-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-base-v0.4) focuses on covering all 24 official EU languages and therefore renders more stable results across these languages and better reflects European values in its answers than English-centric models. This makes it specialized for use in multilingual tasks.
+Since the underlying base model is trained on all 24 EU languages, Teuken-7B-instruct-research-v0.4 is also intended for research use in these 24 languages. Teuken-7B-instruct-research-v0.4 is a fine-tuned variant with a special focus on the German and English languages.
 
 ## Disclaimer Toxic Content:
 
@@ -77,7 +78,7 @@ Teuken-7B-instruct-research-v0.4 is an instruction-tuned version of [Teuken-7B-b
 The model requires transformers, sentencepiece, and the torch library.
 After installation, here's an example of how to use the model:
 
-The prompt template for the fine-tuned model is defined as follows:
+As this model is fine-tuned, it must be used with the provided prompt template; using it without this template is not intended and not recommended. The prompt template is defined as follows:
 ```python
 user="Hi!"
 lang_code = "DE"
@@ -91,6 +92,7 @@ system_messages={
 prompt = f"System: {system_messages[lang_code]}\nUser: {user}\nAssistant:<s>"
 ```
 
+The prompt template is also integrated directly into the tokenizer and can be used as follows:
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
````
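For reference, the manual prompt construction that the diff's first snippet defines can be sketched as a small self-contained example. The `system_messages` texts below are illustrative assumptions; the actual per-language system messages are elided in the hunk and only the template shape (`System: …\nUser: …\nAssistant:<s>`) is visible in the diff.

```python
# Minimal sketch of the manual prompt template from the README diff.
# The system messages here are assumed placeholder texts -- the real
# model card defines one system message per supported language code.
user = "Hi!"
lang_code = "DE"

system_messages = {
    "DE": "Ein Gespräch zwischen einem Menschen und einem KI-Assistenten.",  # assumed text
    "EN": "A chat between a human and an AI assistant.",                     # assumed text
}

# Template shape taken verbatim from the diff: System / User / Assistant,
# with "<s>" marking where the model's completion should begin.
prompt = f"System: {system_messages[lang_code]}\nUser: {user}\nAssistant:<s>"
print(prompt)
```

Indexing the system message by `lang_code` is what lets one template serve all supported languages: only the system message changes, while the `System:`/`User:`/`Assistant:` scaffold stays fixed.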