pankajmathur committed · Commit 6a902cb · 1 Parent(s): 31e1a7b

Update README.md

README.md CHANGED
@@ -114,7 +114,17 @@ model-index:
 ---
 # orca_mini_3b
 
-
+<img src="https://huggingface.co/pankajmathur/orca_mini_v5_8b/resolve/main/orca_minis_small.jpeg" width="auto" />
+
+<strong>
+Passionate about Generative AI? I help companies privately train and deploy custom LLMs/MLLMs affordably. For startups, I can even assist with securing GPU grants to get you started. Let's chat!
+
+<a href="https://www.linkedin.com/in/pankajam" target="_blank">https://www.linkedin.com/in/pankajam</a> Looking forward to connecting!
+</strong>
+
+<br>
+
+**Use orca-mini-3b for Free on Google Colab with T4 GPU :)**
 
 <a target="_blank" href="https://colab.research.google.com/#fileId=https://huggingface.co/psmathur/orca_mini_3b/blob/main/orca_mini_3b_T4_GPU.ipynb">
 <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
@@ -123,7 +133,7 @@ Use orca-mini-3b on Free Google Colab with T4 GPU :)
 An [OpenLLaMa-3B model](https://github.com/openlm-research/open_llama) trained on explain-tuned datasets, created using instructions and inputs from the WizardLM, Alpaca & Dolly-V2 datasets and applying the Orca Research Paper's dataset construction approaches.
 
 
-
+### Dataset
 
 We built the explain-tuned [WizardLM dataset ~70K](https://github.com/nlpxucan/WizardLM), [Alpaca dataset ~52K](https://crfm.stanford.edu/2023/03/13/alpaca.html) & [Dolly-V2 dataset ~15K](https://github.com/databrickslabs/dolly) using approaches from the [Orca Research Paper](https://arxiv.org/abs/2306.02707).
 
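The Dataset section introduced above describes explain tuning: each instruction from WizardLM, Alpaca, and Dolly-V2 is paired with an Orca-style system message that asks the teacher model to reason step by step, and the teacher's detailed response becomes the training target. A minimal sketch of how one such record could be assembled follows; the system wording paraphrases the Orca paper's prompts, and the field names are illustrative, not taken from this repo:

```python
# Illustrative sketch of building one explain-tuned training record.
# ORCA_STYLE_SYSTEM paraphrases one of the 16 system messages from the
# Orca Research Paper; the actual datasets may use different wording.
ORCA_STYLE_SYSTEM = (
    "You are an AI assistant. The user will give you a task. "
    "Complete the task as faithfully as you can, thinking step by step "
    "and justifying your steps."
)

def make_explain_tuned_record(instruction: str, input_text: str, teacher_answer: str) -> dict:
    """Pair a WizardLM/Alpaca/Dolly-V2 instruction with a teacher's explained answer."""
    return {
        "system": ORCA_STYLE_SYSTEM,
        "instruction": instruction,
        "input": input_text,
        # The target is the teacher's step-by-step explanation,
        # not the terse answer from the original dataset.
        "output": teacher_answer,
    }
```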
@@ -134,7 +144,7 @@ This helps student model aka this model to learn ***thought*** process from teac
 Please see the example usage below for how the **System** prompt is added before each **instruction**.
 
 
-
+### Training
 
 The training configurations are provided in the table below.
 
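To make the System-before-instruction layout concrete, here is a small sketch of prompt assembly, assuming the `### System:` / `### User:` / `### Response:` markers used by orca_mini-style model cards (the card's own example is authoritative):

```python
def build_prompt(system: str, instruction: str, input_text: str = "") -> str:
    """Assemble a prompt with the System message before the instruction (assumed orca_mini layout)."""
    if input_text:
        return (
            f"### System:\n{system}\n\n"
            f"### User:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            f"### Response:\n"
        )
    return f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:\n"
```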
@@ -156,7 +166,7 @@ Here are some of params used during training:
 
 
 
-
+### Example Usage
 
 Below is an example of how to use this model:
 
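For context on the usage example this hunk labels, here is a minimal generation sketch with Hugging Face `transformers`. This is an assumed setup: the repo id is taken from the Colab link above and the prompt layout from the sketch earlier; the linked notebook shows the card's own version.

```python
# Minimal sketch: load orca_mini_3b and generate one response.
# Requires: pip install torch transformers sentencepiece accelerate
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_path = "psmathur/orca_mini_3b"
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

system = "You are an AI assistant that follows instruction extremely well. Help as much as you can."
instruction = "Tell me about orcas."
prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```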
@@ -230,8 +240,6 @@ Sincerely,
 ```
 
 
-**P.S. I am #opentowork and #collaboration, if you can help, please reach out to me at www.linkedin.com/in/pankajam**
-
 
 Next Goals:
 1) Try more data like actually using FLAN-v2, just like the Orca Research Paper (I am open to suggestions)
@@ -304,7 +312,7 @@ If you found wizardlm_alpaca_dolly_orca_open_llama_3b useful in your research or
 howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
 }
 ```
-
+### [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_psmathur__orca_mini_3b)
 
 | Metric | Value |
@@ -318,7 +326,7 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
 | GSM8K (5-shot) | 0.08 |
 | DROP (3-shot) | 14.33 |
 
-
+### [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_psmathur__orca_mini_3b)
 
 | Metric | Value |