Great reward model! What dataset did you use to train it?

by zolicsaki - opened Jul 25

Jul 25

Specifically, I was wondering whether you trained it on the LMSYS Chatbot Arena conversations, since your model performs so well when evaluated on those preferences. Thanks for the help!

https://huggingface.co/datasets/lmsys/chatbot_arena_conversations

zolicsaki

Aug 6

@RangiLyu

zolicsaki

Aug 6

@ZwwWayne

RangiLyu

InternLM org Aug 7

Sorry for the late reply. We did use a portion of this dataset. We performed data cleaning and filtering, including removing toxic and unsafe data, to ensure quality and safety.

zolicsaki

Aug 8

@RangiLyu Thanks!
