RL LLM AGENT

community

https://www.sanjibanchoudhury.com/

rl-llm-agent's activity

sc2582

updated a model 25 days ago

rl-llm-agent/Llama-3.2-3B-Instruct-sft-alfworld-leap-iter1

Text Generation • Updated 25 days ago • 13

sc2582

published a model 25 days ago

rl-llm-agent/Llama-3.2-3B-Instruct-sft-alfworld-leap-iter1

Text Generation • Updated 25 days ago • 13

sc2582

updated a model about 2 months ago

rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-iqlearn-iter1

Updated Jan 20 • 12

sc2582

published 9 models about 2 months ago

rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-exploration-aflworld-iter0-checkpoint-50

Updated Jan 16 • 8

sc2582

updated 8 models about 2 months ago

rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-exploration-aflworld-iter0-checkpoint-50

Updated Jan 16 • 8

rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-iter2-70k

Updated Jan 16 • 7

rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-shaped-iter0

Updated Jan 14 • 6

rl-llm-agent/Llama-3.2-3B-Instruct-value-alfworld-8b-sft

Updated Jan 13 • 8

rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iqlearn-iter0

Updated Jan 13 • 14

rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-iqlearn-iter0

Updated Jan 13 • 8

rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iter2

Updated Jan 11 • 11

rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iter1

Text Generation • Updated Jan 10 • 39

Team members 2
