Wei Xiong

weqweasdas

https://weixiongust.github.io/WeiXiongUST/index.html

AI & ML interests

Machine learning, RLHF

Recent Activity

updated a dataset about 13 hours ago

mytestdpo/llama3_it_gsm8k_sft_model_gen1_auggsm8k_2nd_round_prompt

updated a dataset about 13 hours ago

mytestdpo/llama3_it_gsm8k_sft_model_gen1_gsm8k_2nd_round_prompt

updated a dataset about 14 hours ago

selfcorrexp2/type12_type4_8b_type3_6k_cut_separate_pr

weqweasdas's activity

New activity in RLHFlow/LLaMA3-SFT 4 months ago

LLaMA3.1-SFT

#3 opened 4 months ago by

New activity in Qwen/Qwen2.5-Math-RM-72B 4 months ago

example to service the RM

#2 opened 4 months ago by

New activity in RLHFlow/LLaMA3-SFT 4 months ago

How to use llama 3sft model, pipeline or tokenizer.apply_chat_template. Can you provide a simple example? Thank you very much for your contribution

#2 opened 4 months ago by

New activity in RLHFlow/LLaMA3-SFT 5 months ago

Missing BOS token in tokenized text

#1 opened 5 months ago by

New activity in RLHF4MATH/Gemma-7B-it-SFT3epoch 6 months ago

Update README.md

#1 opened 6 months ago by

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 6 months ago

Special tokens in the vocabulary?

#13 opened 6 months ago by

New activity in sfairXC/FsfairX-LLaMA3-RM-v0.1 7 months ago

TypeError: Got unsupported ScalarType BFloat16

#5 opened 7 months ago by

New activity in RLHFlow/pair-preference-model-LLaMA3-8B 7 months ago

Could you please test the consistency of preference between `RLHFlow/pair-preference-model-LLaMA3-8B` and GPT-4 on alpacaeval dataset?

#2 opened 7 months ago by

commented a paper 8 months ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 66 •

New activity in weqweasdas/RM-Mistral-7B 8 months ago

why vocab size is 32001

#3 opened 8 months ago by

New activity in weqweasdas/RM-Mistral-7B 9 months ago

License

#2 opened 9 months ago by

New activity in weqweasdas/RM-Mistral-7B 10 months ago

Fix dataset link

#1 opened 10 months ago by