Update README.md
README.md
CHANGED
@@ -1,3 +1,21 @@
----
-license: llama3
----
+---
+license: llama3
+---
+
+# Weights from the Llama-3-8B Self-Align Experiments
+
+[WEIGHTS TO BE UPLOADED ONCE DONE]
+
+## Training Config
+
+Use `config.yaml` with `accelerate launch`; `run.sh` was used to launch training with the [StarCoder2 Self-Align training script](https://github.com/bigcode-project/starcoder2-self-align?tab=readme-ov-file#training-details).
+Some tweaks were needed to get this working on 48GB of VRAM:
+- FSDP was used
+- `per_device_batch_size` is `2`
+- A learning rate of 3e-6 was used
+
+
+## Environment:
+
+- Trained with 2x4090 GPUs
+- 128GB RAM
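As a rough illustration of what the settings above imply, the sketch below computes the number of samples per optimizer step from the per-device batch size and GPU count given in the README. The function name `effective_batch_size` is hypothetical, and `gradient_accumulation_steps` is an assumption: the README does not state its value, so `1` is used only as a placeholder.

```python
def effective_batch_size(
    per_device_batch_size: int,
    num_devices: int,
    gradient_accumulation_steps: int = 1,
) -> int:
    """Return the number of samples that contribute to one optimizer step.

    With data-parallel training (including FSDP), each device processes its
    own micro-batch, so the global batch is the product of the three values.
    """
    if min(per_device_batch_size, num_devices, gradient_accumulation_steps) < 1:
        raise ValueError("all arguments must be positive integers")
    return per_device_batch_size * num_devices * gradient_accumulation_steps


# Values stated in the README: per_device_batch_size = 2, trained on 2 GPUs.
# gradient_accumulation_steps is NOT stated in the README; 1 is a placeholder.
print(effective_batch_size(per_device_batch_size=2, num_devices=2))
```

If the actual run used gradient accumulation, pass the value from `config.yaml` or `run.sh` as the third argument to get the true global batch size.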