Aleks

aleksfinn23

AI & ML interests

None yet

Recent Activity

liked a Space about 1 month ago
Qwen/Qwen2.5-Turbo-1M-Demo
updated a Space about 2 months ago
Initairu/FLUX
updated a Space about 2 months ago
Initairu/asr

Organizations

Initai_ru

aleksfinn23's activity

replied to m-ric's post 3 months ago
reacted to nyuuzyou's post with ❤️ 3 months ago
🎓 Introducing Doc4web.ru Documents Dataset - nyuuzyou/doc4web

Dataset highlights:
- 223,739 documents from doc4web.ru, a document hosting platform for students and teachers
- Primarily in Russian, with some English and potentially other languages
- Each entry includes: URL, title, download link, file path, and content (where available)
- Contains original document files in addition to metadata
- Data reflects a wide range of educational topics and materials
- Licensed under Creative Commons Zero (CC0) for unrestricted use

The dataset can be used for analyzing educational content in Russian, text classification tasks, and information retrieval systems. It's also valuable for examining trends in educational materials and document sharing practices in the Russian-speaking academic community. The inclusion of original files allows for in-depth analysis of various document formats and structures.
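
As a rough illustration of the entry layout described in the post, here is a minimal Python sketch. The field names (`url`, `title`, `download_link`, `file_path`, `content`) are assumptions inferred from the post's wording, not the dataset's confirmed column names, and the two sample records are made-up placeholders rather than real data. The helper functions are hypothetical as well.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc4WebEntry:
    # Field names are assumed from the post's description of each entry;
    # the real dataset columns may be named differently.
    url: str
    title: str
    download_link: str
    file_path: str
    content: Optional[str] = None  # content is only present "where available"


def with_content(entries: List[Doc4WebEntry]) -> List[Doc4WebEntry]:
    """Keep only entries whose extracted text is present and non-blank."""
    return [e for e in entries if e.content and e.content.strip()]


def search_titles(entries: List[Doc4WebEntry], keyword: str) -> List[Doc4WebEntry]:
    """Case-insensitive substring match on titles (a toy retrieval step)."""
    needle = keyword.casefold()
    return [e for e in entries if needle in e.title.casefold()]


# Placeholder records for illustration only; not taken from the dataset.
sample = [
    Doc4WebEntry(
        url="https://example.com/doc/1",
        title="Algebra worksheet",
        download_link="https://example.com/download/1",
        file_path="files/1.docx",
        content="Solve the following equations.",
    ),
    Doc4WebEntry(
        url="https://example.com/doc/2",
        title="История России",
        download_link="https://example.com/download/2",
        file_path="files/2.pptx",
        content=None,
    ),
]

usable = with_content(sample)
hits = search_titles(sample, "история")
```

Filtering on `content` first is a sensible default for text classification, since entries without extracted text would only contribute metadata.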