Text Generation · Transformers · Safetensors · Finnish · llama · conversational · text-generation-inference
aapot committed · Commit 7cf892c · verified · 1 Parent(s): 325822d

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -40,20 +40,18 @@ And two instruct-tuned versions:
 
 ## Intended uses & limitations
 
-This model was trained only with Finnish texts excluding code so it should not be used for multilingual and code generation use cases.
-
 This model was pretrained only in a self-supervised way, without any supervised training. You can use this model for text generation or fine-tune it for a downstream task. This model followed a 2-stage pretraining approach where single-turn instruction-following examples were mixed in with the other training data in the second stage (explained more later in this readme). Thanks to this approach, this pretrained model is already capable of instruction following, but you might get even better results if you specifically fine-tune it for instruction following or other use cases. For instruction-following fine-tuning, you should use the same prompt format showcased below.
 
 ### How to use
 
-### Fine-tuning
+#### Fine-tuning
 
 We have now added finetuning example notebook along with video! \
 Notebook: https://huggingface.co/Finnish-NLP/Ahma-3B/blob/main/Finetune_Ahma_3B_example.ipynb \
 Video: https://www.youtube.com/watch?v=6mbgn9XzpS4
 
 
-### Inference
+#### Inference
 
 If you want to use this model for instruction-following, you need to use the same prompt format we used in the second stage of the pretraining (basically the same format what Meta used in their Llama2 models). **Note: do not use "LlamaTokenizer" from transformers library but always use the AutoTokenizer instead, or use the plain sentencepiece tokenizer.** Here is an example using the instruction-following prompt format, with some generation arguments you can modify for your use:
 
@@ -110,6 +108,8 @@ You may experiment with different system prompt instructions too if you like.
 
 ### Limitations and bias
 
+This model was trained only with Finnish texts excluding code so it should not be used for multilingual and code generation use cases.
+
 The training data used for this model contains a lot of content from the internet, which is far from neutral. Therefore, the model can have biased predictions. This bias will also affect all fine-tuned versions of this model.
 
 To reduce toxic content, training data was filtered with a toxicity classifier but it cannot truly eliminate all toxic text.
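
The Inference paragraph in the first hunk refers to a code example that falls outside this diff's context. As a rough companion, here is a minimal sketch of loading and prompting the model; it assumes the Llama2-style single-turn template the text mentions, and the system prompt, user message, and generation arguments below are illustrative, not taken from the model card, which remains the authoritative reference.

```python
# Minimal inference sketch (not the README's own example). Assumes a Llama2-style
# single-turn prompt template; check the model card for the exact template and
# system prompt used in stage-2 pretraining.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Finnish-NLP/Ahma-3B"

# Per the note in the diff: use AutoTokenizer, not LlamaTokenizer.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Illustrative Finnish strings ("You are a helpful AI assistant." /
# "Briefly tell me about Finland's history.")
system_prompt = "Olet avulias tekoälyassistentti."
user_message = "Kerro lyhyesti Suomen historiasta."

# Assumed Llama2-style format; the tokenizer adds the BOS token itself.
prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,   # generation arguments you can tune for your use case
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    repetition_penalty=1.2,
)
# Strip the prompt tokens and decode only the generated continuation.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```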
 
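The diff also promotes the Fine-tuning pointer to its own heading, and the README says fine-tuning data should use the same prompt format. As a hedged sketch of that data-formatting step only, the helper below renders one single-turn example into the assumed Llama2-style template; the function and field names are hypothetical, and the linked Finetune_Ahma_3B_example.ipynb notebook is the authoritative walkthrough of the actual training setup.

```python
# Hypothetical helper for rendering one single-turn example into the assumed
# Llama2-style template before tokenization.
def render_example(instruction: str, response: str, system_prompt: str) -> str:
    prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{instruction} [/INST]"
    # Append the response and an EOS marker (assumed to be the tokenizer's
    # default "</s>") so the model learns where an answer ends.
    return f"{prompt} {response}</s>"

text = render_example(
    instruction="Käännä englanniksi: Hyvää huomenta!",  # "Translate into English: Good morning!"
    response="Good morning!",
    system_prompt="Olet avulias tekoälyassistentti.",   # "You are a helpful AI assistant."
)
```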