# Model Card for jaeyong2/Qwen2.5-0.5B-Instruct-JaMagpie-Preview

## Model Details

This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct).

## Evaluation
### llm-jp-eval script (Colab)

```shell
!git clone https://github.com/llm-jp/llm-jp-eval.git
!cd llm-jp-eval && pip install -e .
!cd llm-jp-eval && python scripts/preprocess_dataset.py --dataset-name all --output-dir ./dataset_dir
!cd llm-jp-eval && python scripts/evaluate_llm.py -cn config.yaml \
  model.pretrained_model_name_or_path=jaeyong2/Qwen2.5-0.5B-Instruct-JaMagpie-Preview \
  tokenizer.pretrained_model_name_or_path=jaeyong2/Qwen2.5-0.5B-Instruct-JaMagpie-Preview \
  dataset_dir=./dataset_dir/1.4.1/evaluation/test
```
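Outside the evaluation harness, the checkpoint can be loaded with the standard `transformers` chat API. This is a minimal sketch: the prompt and generation settings are illustrative, not taken from the card, and only the model ID comes from the evaluation command above.

```python
# Minimal inference sketch (assumes `transformers` is installed and enough RAM
# for a 0.5B-parameter model). Prompt and max_new_tokens are illustrative.
MODEL_ID = "jaeyong2/Qwen2.5-0.5B-Instruct-JaMagpie-Preview"

def generate(prompt: str) -> str:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Build the chat-formatted input expected by instruct-tuned Qwen models.
    messages = [{"role": "user", "content": prompt}]
    text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tok(text, return_tensors="pt")

    out = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, dropping the prompt.
    return tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
```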
| Benchmark | Qwen2.5-0.5B-Instruct | finetuning-model |
|---|---|---|
| mmlu | 0.4592 | 0.4614 |
| llm-jp-eval | Qwen2.5-0.5B-Instruct | finetuning-model |
|---|---|---|
| AVG | 0.3037 | 0.3176 |
| CG | 0 | 0 |
| EL | 0.2637 | 0.3146 |
| FA | 0.0386 | 0.0419 |
| HE | 0.2700 | 0.3250 |
| MC | 0.4033 | 0.3733 |
| MR | 0.0900 | 0.2700 |
| MT | 0.6148 | 0.6691 |
| NLI | 0.5460 | 0.3180 |
| QA | 0.2608 | 0.2791 |
| RC | 0.5495 | 0.5847 |
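The AVG row appears to be the unweighted mean of the ten category scores, which can be checked quickly (an assumption about how llm-jp-eval aggregates; the score values are copied from the table above):

```python
# Per-category llm-jp-eval scores from the table above.
baseline = {"CG": 0.0, "EL": 0.2637, "FA": 0.0386, "HE": 0.2700, "MC": 0.4033,
            "MR": 0.0900, "MT": 0.6148, "NLI": 0.5460, "QA": 0.2608, "RC": 0.5495}
finetuned = {"CG": 0.0, "EL": 0.3146, "FA": 0.0419, "HE": 0.3250, "MC": 0.3733,
             "MR": 0.2700, "MT": 0.6691, "NLI": 0.3180, "QA": 0.2791, "RC": 0.5847}

def avg(scores: dict) -> float:
    """Unweighted mean over all categories (assumed aggregation rule)."""
    return sum(scores.values()) / len(scores)

print(f"baseline AVG  = {avg(baseline):.4f}")   # ~0.3037
print(f"finetuned AVG = {avg(finetuned):.4f}")  # ~0.3176
```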
## License

- Qwen/Qwen2.5-0.5B-Instruct: [Apache-2.0](https://choosealicense.com/licenses/apache-2.0/)
## Acknowledgement

This research is supported by the TPU Research Cloud program.