igitman committed
Commit 00bf679
1 Parent(s): 6989c6f

Update README.md

Files changed (1)
  1. README.md +7 -3
README.md CHANGED
@@ -34,13 +34,17 @@ The pipeline we used to produce the data and models is fully open-sourced!
 - [Models](https://huggingface.co/collections/nvidia/openmath-2-66fb142317d86400783d2c7b)
 - [Dataset](https://huggingface.co/datasets/nvidia/OpenMathInstruct-2)
 
+See our [paper](https://arxiv.org/abs/2410.01560) to learn more details!
 
 # How to use the models?
 
-Our models are fully compatible with Llama3.1-instruct format, so you should be able to just replace an existing Llama3.1 checkpoint and use it in the same way.
-Please note that these models have not been instruction tuned and might not provide good answers outside of math domain.
+Our models are trained with the same "chat format" as Llama3.1-instruct models (same system/user/assistant tokens).
+Please note that these models have not been instruction tuned on general data and thus might not provide good answers outside of math domain.
 
-If you don't know how to use Llama3.1 models, we provide convenient [instructions in our repo](https://github.com/Kipok/NeMo-Skills/blob/main/docs/inference.md).
+This is a NeMo checkpoint, so you need to use [NeMo Framework](https://github.com/NVIDIA/NeMo) to run inference or finetune it.
+We also release a [HuggingFace checkpoint](https://huggingface.co/nvidia/OpenMath2-Llama3.1-70B) and provide easy instructions on how to
+[convert between different formats](https://github.com/Kipok/NeMo-Skills/blob/main/docs/checkpoint-conversion.md) or
+[run inference](https://github.com/Kipok/NeMo-Skills/blob/main/docs/inference.md) with these models using our codebase.
 
 # Reproducing our results
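The commit points to the linked NeMo-Skills docs for the project's official inference instructions. Purely as an illustrative sketch, the HuggingFace checkpoint mentioned in the diff (nvidia/OpenMath2-Llama3.1-70B) can also be queried with the standard `transformers` chat pipeline, since it reuses the Llama3.1-instruct chat template; the prompt and generation settings below are placeholder assumptions, not values from the README.

```python
# Minimal sketch: chat-style inference with the released HuggingFace checkpoint.
# Assumes transformers >= 4.43 and enough GPU memory for a 70B model in bf16.
import torch
import transformers

model_id = "nvidia/OpenMath2-Llama3.1-70B"  # HF checkpoint linked in the diff

pipe = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Example math question (placeholder); the tokenizer's chat template supplies
# the Llama3.1 system/user/assistant tokens automatically.
messages = [
    {"role": "user", "content": "What is the minimum value of x^2 + 6x + 5?"},
]

outputs = pipe(messages, max_new_tokens=512)
# The pipeline returns the full conversation; the last message is the model's reply.
print(outputs[0]["generated_text"][-1]["content"])
```

Because the chat template handles the special tokens, no manual prompt formatting is needed; converting to or from the NeMo format should follow the checkpoint-conversion doc linked above.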