Resources for EMNLP 2024 Paper: Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring
J Li
jiazhengli
AI & ML interests
None yet
Recent Activity
upvoted
a
paper
20 days ago
Two Heads Are Better Than One: Dual-Model Verbal Reflection at
Inference-Time
updated
a model
23 days ago
jiazhengli/Qwen2.5-7B-RoleMRC-sft
updated
a model
23 days ago
jiazhengli/Qwen2.5-7B-RoleMRC-dpo
Organizations
None yet
Collections
3
Resources for EMNLP 2024 Paper: Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence
-
jiazhengli/Pythia-2.8B-HH-RLHF-Iterative-SamPO
Text Generation • Updated • 6 -
jiazhengli/Pythia-2.8B-TLDR-Iterative-SamPO
Text Generation • Updated • 6 -
Junrulu/Llama-3-8B-Instruct-Iterative-SamPO
Text Generation • Updated • 5 • 1 -
Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence
Paper • 2406.10957 • Published • 1
models
12
jiazhengli/Qwen2.5-7B-RoleMRC-sft
Updated
•
15
jiazhengli/Qwen2.5-7B-RoleMRC-dpo
Updated
•
12
jiazhengli/Llama-3.1-8B-RoleMRC-sft
Updated
•
16
jiazhengli/Llama-3.1-8B-RoleMRC-dpo
Updated
•
8
jiazhengli/long-t5-tglobal-large-AERA
Text2Text Generation
•
Updated
•
9
jiazhengli/Mixtral-8x7B-Instruct-v0.1-QLoRA-Assessment-Rationale-dpo
Updated
•
3
jiazhengli/Mixtral-8x7B-Instruct-v0.1-QLoRA-Assessment-Rationale-sft
Updated
•
5
jiazhengli/Meta-Llama-3-8B-QLoRA-Assessment-Rationale-sft
Updated
•
5
jiazhengli/Meta-Llama-3-8B-QLoRA-Assessment-Rationale-dpo
Updated
•
1
•
1
jiazhengli/deberta-v3-large-Rationale-to-Score
Text Classification
•
Updated
•
14
•
1