Abstract
TextArena is an open-source collection of competitive text-based games for training and evaluating agentic behavior in Large Language Models (LLMs). It spans 57+ unique environments (including single-player, two-player, and multi-player setups) and allows for easy evaluation of model capabilities via an online-play system (against humans and other submitted models) with real-time TrueSkill scores. Traditional benchmarks rarely assess dynamic social skills such as negotiation, theory of mind, and deception, creating a gap that TextArena addresses. Designed with research, community, and extensibility in mind, TextArena emphasizes ease of adding new games, adapting the framework, testing models, playing against the models, and training models. Detailed documentation of environments, games, the leaderboard, and examples is available at https://github.com/LeonGuertler/TextArena and https://www.textarena.ai/.
Community
You can play with the models here: https://www.textarena.ai/
Leaderboard: https://www.textarena.ai/leaderboard
Code: https://github.com/LeonGuertler/TextArena
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models (2025)
- TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning (2025)
- Digital Player: Evaluating Large Language Models based Human-like Agent in Games (2025)
- Among Them: A game-based framework for assessing persuasion capabilities of LLMs (2025)
- Are Large Vision Language Models Good Game Players? (2025)
- WebGames: Challenging General-Purpose Web-Browsing AI Agents (2025)
- Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models (2025)