Update README.md
Code for the SakanaAI/gsm8k-ja-test_250-1319 evaluation can be found [here](https://drive.google.com/file/d/1gCzCJv5vasw8R3KVQimfoIDFyfxwxNvC/view?usp=sharing).
We further use the first 50 prompts from [DeL-TaiseiOzaki/Tengentoppa-sft-reasoning-ja](https://huggingface.co/datasets/DeL-TaiseiOzaki/Tengentoppa-sft-reasoning-ja) to evaluate the percentage of valid Japanese `<think>` sections in model responses.
This benchmark contains more varied and complex prompts, meaning this is a more realistic evaluation of how reliably this model can output Japanese.
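This metric needs a way to decide whether a response contains a valid `<think>` section and whether that section is written in Japanese. The actual evaluation code is not shown here; the sketch below is our own illustration, assuming a regex over `<think>…</think>` and a character-range heuristic (at least half of the non-space characters are kana or CJK ideographs) as the "Japanese" test:

```python
import re

# Hypothetical helper, not the project's evaluation script.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def is_japanese_think(response: str, min_ratio: float = 0.5) -> bool:
    match = THINK_RE.search(response)
    if match is None:
        return False  # no valid <think> section at all
    text = match.group(1)
    # Count hiragana, katakana, and CJK ideographs as "Japanese" characters.
    jp = sum(1 for ch in text if "\u3040" <= ch <= "\u30ff" or "\u4e00" <= ch <= "\u9fff")
    chars = sum(1 for ch in text if not ch.isspace())
    return chars > 0 and jp / chars >= min_ratio

print(is_japanese_think("<think>これは日本語の思考です。</think>答え"))  # True
print(is_japanese_think("<think>This is English reasoning.</think>"))   # False
```

The reported percentage would then be the fraction of the 50 responses for which such a check returns true.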
| | Repetition Penalty | Valid Japanese `<think>` (%) |

We made the data for this model using the following steps:

4. Generate answers to prompts using [deepseek-ai/DeepSeek-R1-Distill-Llama-70B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B).
5. Filter out responses which did not:
   * Finish within 2048 tokens
   * Contain a valid `<think>` section
   * Have the `<think>` section written in Japanese
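A compact sketch of this filtering pass (hypothetical code, not the pipeline actually used; the `token_count` bookkeeping and the 50% Japanese-character threshold are our assumptions):

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def passes_filter(response: str, token_count: int, max_tokens: int = 2048) -> bool:
    """Apply the three filter criteria to one generated response."""
    if token_count >= max_tokens:   # generation hit the cap: did not finish
        return False
    m = THINK_RE.search(response)
    if m is None:                   # no valid <think> section
        return False
    think = m.group(1)
    # Heuristic (our assumption): the <think> section counts as Japanese if at
    # least half of its non-space characters are kana or CJK ideographs.
    jp = sum(1 for c in think if "\u3040" <= c <= "\u30ff" or "\u4e00" <= c <= "\u9fff")
    total = sum(1 for c in think if not c.isspace())
    return total > 0 and jp / total >= 0.5

# Keep only responses that satisfy all three criteria.
responses = [("<think>考え中です</think>答え", 120),
             ("<think>thinking in English</think>", 120)]
kept = [r for r, n in responses if passes_filter(r, n)]
```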
We used this data to train our model with supervised fine-tuning on [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory), using an [ecs.gn8is-8x.32xlarge](https://www.alibabacloud.com/help/en/ecs/user-guide/gpu-accelerated-compute-optimized-and-vgpu-accelerated-instance-families-1) instance.
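For reference, an SFT run on LLaMA-Factory is typically driven by a YAML config along these lines. The values below are illustrative placeholders (dataset name, template, hyperparameters, and output path are our assumptions, not the actual training config):

```yaml
### model
model_name_or_path: deepseek-ai/DeepSeek-R1-Distill-Llama-70B  # placeholder base model

### method
stage: sft
do_train: true
finetuning_type: full

### dataset
dataset: japanese_reasoning_sft   # placeholder dataset name
template: deepseek                # placeholder chat template
cutoff_len: 4096

### train
output_dir: saves/r1-distill-ja-sft   # placeholder path
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-5
num_train_epochs: 3.0
bf16: true
```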