whoisjones's Collections
MastermindEval
Updated 3 days ago
Evaluating the reasoning capabilities of LLMs using the game of Mastermind (paper forthcoming).
flair/mastermind_35_mcq_random • Viewer • Updated 3 days ago • 37.1k • 5
flair/mastermind_46_mcq_random • Viewer • Updated 3 days ago • 36.1k • 3
flair/mastermind_46_mcq_close • Viewer • Updated 3 days ago • 36.1k • 3
flair/mastermind_24_mcq_random • Viewer • Updated 3 days ago • 30.4k • 4
flair/mastermind_24_mcq_close • Viewer • Updated 3 days ago • 30.4k • 6
flair/mastermind_35_mcq_close • Viewer • Updated 3 days ago • 37.1k • 9