Article Releasing the largest multilingual open pretraining dataset By Pclanglais • Nov 13 • 98
Article OCR Processing and Text in Image Analysis with Florence-2-base and Qwen2-VL-2B By PandorAI1995 • Oct 18 • 13
Uni-Direction Translation Models Collection HPLT's MT releases. https://github.com/hplt-project/HPLT-MT-Models • 65 items • Updated Oct 24 • 1
OpenCulture Collection A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6 • 121