OpenHands
community
https://github.com/All-Hands-AI/OpenHands
Followers: 40
AI & ML interests
None defined yet.
Recent Activity
JustinLin610 authored a paper about 5 hours ago
RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques
yuexiang96 authored a paper 3 days ago
Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos
JustinLin610 authored a paper 5 days ago
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
Team members: 16
spaces: 1
🙌 OpenHands Evaluation Benchmark • Build error • 36
models: 1
OpenHands/CodeQwen1.5-7B-OpenDevin • Text Generation • Updated May 25, 2024 • 20 • 16
datasets: 7
OpenHands/eval-output-webarena • Updated Jul 20, 2024 • 10
OpenHands/eval-browsing-instructions • Viewer • Updated Jul 15, 2024 • 933 • 4
OpenHands/eval-output-miniwob • Updated Jun 10, 2024 • 2
OpenHands/SWE-bench-devin-passed • Viewer • Updated Apr 9, 2024 • 79 • 38
OpenHands/SWE-bench-devin-full-filtered • Viewer • Updated Apr 9, 2024 • 450 • 36 • 1
OpenHands/SWE-bench-devin-full • Viewer • Updated Apr 9, 2024 • 570 • 36
OpenHands/Devin-SWE-bench-output • Viewer • Updated Mar 21, 2024 • 1.14k • 38