mlfoundations-dev/dpo_from_stratos_judged_annotated_rejected_responses • Text Generation • Updated Feb 5 • 433 • 1
mlfoundations-dev/dpo_from_multiple_samples_shortest_numina_aime • Text Generation • Updated Feb 6 • 462