Multimodal Classification Model (BM-v1)
This model combines text and image inputs to predict player moves from in-game screenshots of the popular 4X game Civilization VI. In use, screenshots are provided as input, and the accompanying text inputs are generated using an LLM.
Model Details
- Developed by: BeakerStreet
- Model type: Multimodal Classification Model
- Language(s): English
- License: MIT
Uses
Predicts the likely moves a player will make, drawn from the complete sample space of all observed player moves, based on a provided screenshot and associated text. Can be fine-tuned to predict specific types of move (scouting, build orders, settle/doesn't settle).
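One way to prepare data for fine-tuning on move types is to collapse fine-grained move labels into coarser categories before training. The sketch below is illustrative only: the label names in `MOVE_TYPE`, the `(screenshot_path, text, move_label)` sample layout, and the helper names are all assumptions, not part of BM-v1.

```python
# Hypothetical mapping from fine-grained move labels to coarse move types.
# These label names are invented for illustration; the real label set
# depends on the moves observed in the training data.
MOVE_TYPE = {
    "move_scout": "scouting",
    "build_warrior": "build order",
    "build_monument": "build order",
    "found_city": "settle",
}


def to_move_type(move_label, default="other"):
    """Return the coarse move type for a label, or `default` if unmapped."""
    return MOVE_TYPE.get(move_label, default)


def relabel(samples):
    """Replace each sample's move label with its coarse move type.

    `samples` is assumed to be a list of
    (screenshot_path, text, move_label) tuples.
    """
    return [(image, text, to_move_type(move)) for image, text, move in samples]
```

The relabelled samples could then be used as targets for a smaller classification head, for example a settle/doesn't-settle predictor.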
Direct Use
Predicts the likely moves a player will make, from a complete sample space of all player moves, based on a provided screenshot and associated text.
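As a rough illustration of what "likely moves" means for a classifier over a fixed set of moves, the sketch below turns raw per-move scores into probabilities with a softmax and returns the top-ranked moves. It assumes the model emits one score per move label; the function names and the example labels are hypothetical, and this is not the model's actual inference code.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_moves(scores, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Example with made-up scores and labels:
# top_moves([2.0, 0.5, 1.0], ["found_city", "move_scout", "build_warrior"], k=2)
```

Returning a ranked list rather than a single label keeps the uncertainty visible, which can matter when several moves are plausible from the same screenshot.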