The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms • Paper • 2303.00694 • Published Mar 1, 2023
Inverse Reinforcement Learning without Reinforcement Learning • Paper • 2303.14623 • Published Mar 26, 2023
ManiCast: Collaborative Manipulation with Cost-Aware Human Forecasting • Paper • 2310.13258 • Published Oct 20, 2023 • 2
Demo2Code: From Summarizing Demonstrations to Synthesizing Code via Extended Chain-of-Thought • Paper • 2305.16744 • Published May 26, 2023 • 1
InteRACT: Transformer Models for Human Intent Prediction Conditioned on Robot Actions • Paper • 2311.12943 • Published Nov 21, 2023
UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations • Paper • 2311.08469 • Published Nov 14, 2023 • 11
Learning Shared Safety Constraints from Multi-task Demonstrations • Paper • 2309.00711 • Published Sep 1, 2023
Learning to Move Like Professional Counter-Strike Players • Paper • 2408.13934 • Published Aug 25, 2024 • 23
Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback • Paper • 2410.05434 • Published Oct 7, 2024
leap-llm/Meta-Llama-3-8B-Instruct-sft-self-correct-webshop-iter2 • Text Generation • Updated Nov 21, 2024 • 8
leap-llm/Meta-Llama-3-8B-Instruct-sft-alfworld-webshop-intercode-iter0 • Text Generation • Updated Nov 21, 2024 • 10
leap-llm/Meta-Llama-3-8B-Instruct-sft-alfworld-webshop-intercode-iter1 • Text Generation • Updated Nov 20, 2024 • 10
leap-llm/Meta-Llama-3-8B-Instruct-sft-self-correct-webshop-iter1 • Text Generation • Updated Nov 19, 2024 • 9
leap-llm/Meta-Llama-3.1-8B-Instruct-sft-intercode-bash-iter0 • Text Generation • Updated Sep 27, 2024 • 7
leap-llm/Meta-Llama-3.1-70B-Instruct-sft-intercode-bash-iter1 • Text Generation • Updated Sep 15, 2024 • 4
leap-llm/Meta-Llama-3.1-70B-Instruct-sft-intercode-bash-iter0 • Text Generation • Updated Sep 15, 2024 • 4
leap-llm/Meta-Llama-3.1-8B-Instruct-sft-intercode-bash-iter1 • Text Generation • Updated Sep 14, 2024 • 6
leap-llm/Meta-Llama-3-8B-Instruct-sft-intercode-bash-iter1 • Text Generation • Updated Sep 13, 2024 • 6
leap-llm/Meta-Llama-3-8B-Instruct-sft-self-correct-alfworld-iter2 • Text Generation • Updated Sep 13, 2024 • 11