OPT was first introduced in [Open Pre-trained Transformer Language Models](https://arxiv.org/abs/2205.01068) and first released in [metaseq's repository](https://github.com/facebookresearch/metaseq) on May 3rd 2022 by Meta AI.

This model is [facebook/opt-1.3b](https://hf.co/facebook/opt-1.3b) finetuned with [low-rank adapters (LoRA)](https://arxiv.org/abs/2106.09685) on the [FLAN datasets](https://arxiv.org/pdf/2210.11416.pdf).

Low-rank adapters (r=16) were finetuned on 4.2M new tokens of a FLAN task mixture; the start of each example was cut off if it was too long to fit within a 256-token context.
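
As a rough sketch, the adapter setup described above might look like the following with the PEFT library. Only the rank (r=16) and the left-side truncation to 256 tokens come from this card; `lora_alpha`, `target_modules`, and `lora_dropout` are illustrative assumptions, not values stated here.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

# Cut off the *start* of examples that exceed the 256-token context,
# mirroring the truncation described above.
tokenizer.truncation_side = "left"

config = LoraConfig(
    r=16,                                 # adapter rank, as stated above
    lora_alpha=32,                        # assumption; not stated in this card
    target_modules=["q_proj", "v_proj"],  # common choice for OPT; assumption
    lora_dropout=0.05,                    # assumption; not stated in this card
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# Tokenizing a FLAN-style example under the 256-token limit:
batch = tokenizer("Translate to French: Hello, world!",
                  truncation=True, max_length=256, return_tensors="pt")
```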

The model reaches a train perplexity (ppl) of 4.77 and an eval ppl of 4.19.
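
For context, perplexity is the exponential of the per-token cross-entropy loss, so an eval ppl of 4.19 corresponds to a loss of roughly ln(4.19) ≈ 1.43 nats per token.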

### Inference Example (Chain-of-Thought prompt):
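
A minimal sketch of such an inference call, assuming the adapters were saved with the PEFT library; the adapter id `opt-1.3b-lora-flan` is a placeholder for this repository's actual id, and the prompt is an illustrative chain-of-thought example, not one from the card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")

# Attach the finetuned LoRA adapters ("opt-1.3b-lora-flan" is a placeholder id).
model = PeftModel.from_pretrained(base, "opt-1.3b-lora-flan")
model.eval()

# A chain-of-thought style prompt: ask the model to reason step by step.
prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Let's think step by step."
)

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```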