- SciCode: A Research Coding Benchmark Curated by Scientists
  Paper • 2407.13168 • Published • 13
- AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?
  Paper • 2407.15711 • Published • 9
- The Vision of Autonomic Computing: Can LLMs Make It a Reality?
  Paper • 2407.14402 • Published • 13
Michael Chen
michaelchen
AI & ML interests: None yet
Organizations
Collections: 2
Models: None public yet
Datasets: None public yet