arxiv:2405.07863
Wei Xiong
weqweasdas
AI & ML interests
Machine learning, RLHF
Recent Activity
updated a dataset about 1 hour ago: selfcorrexp2/llama31_first_wrong_and_first_corr_regular_norr
updated a dataset about 1 hour ago: selfcorrexp2/10k_llama31_first_wrong_math_chat_format
updated a dataset about 12 hours ago: selfcorrexp2/llama31_no_additional_chat_format_add40k
models 23
weqweasdas/zephyr-7b-dpo-full • Text Generation • 20
weqweasdas/zephyr-7b-gemma-dpo
weqweasdas/zephyr-7b-sft-full
weqweasdas/zephyr-7b-dpo-qlora
weqweasdas/gpt2-cpt-dutch • Text Generation • 22
weqweasdas/zephyr-7b-gemma-sft
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6_weight085 • Text Generation • 12
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6 • Text Generation • 10
weqweasdas/raft_baseline_zephyr_packing_model6 • Text Generation • 15
weqweasdas/raft_baseline_openchat_llama13b_model1 • Text Generation • 13
datasets 88
weqweasdas/llama3_aug_and_base_math_train_n45 • 837k • 63
weqweasdas/llama3_aug_math_train_n45 • 1.01M • 63
weqweasdas/llama3_math_train_n45 • 337k • 68
weqweasdas/llama3_70b_complte_llama3_8b_self_corr_sft • 37.5k • 43
weqweasdas/prompt_for_test_reflection • 20.7k • 54
weqweasdas/prompt_for_gen_reflection • 2.64k • 40
weqweasdas/base_llama3_8b_prompt • 37.4k • 46
weqweasdas/tmpzzz • 810k • 48
weqweasdas/new_8b_self_corr_standard • 2.57M • 119
weqweasdas/new_8b_self_corr_sft • 93.8k • 47