victor (Victor Mustar)

liked a model about 2 hours ago

deepseek-ai/DeepSeek-V3-Base

Updated about 3 hours ago • 112

liked a Space about 3 hours ago

Running

146

🌍

QVQ 72B Preview

liked a model 1 day ago

answerdotai/ModernBERT-base

Fill-Mask • Updated 6 days ago • 27.6k • 437

upvoted a paper 1 day ago

Parallelized Autoregressive Visual Generation

Paper • 2412.15119 • Published 6 days ago • 44

reacted to Kseniase's post with 👍 2 days ago

Post

2274

**15 Agentic Systems and Frameworks of 2024**

This year, we started our “AI Agents and Agentic Workflows” series (https://www.turingpost.com/t/AI-Agents) to explore everything about AI agents step by step: all the vocabulary, how they work, and how to build them.
The huge interest in this series and the large number of studies conducted on agents showed that it was one of the most popular and important themes of the year. In 2025, most likely, agents will reach new highs – we will be covering that for you. Now, let’s review the agentic systems that have emerged this year.

Here is a list of 15 agentic systems and frameworks of 2024:

1. GUI Agents: A Survey (2412.13501)

2. Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level (2411.03562)

3. The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (2408.06292)

4. MALT: Improving Reasoning with Multi-Agent LLM Training (2412.01928)

5. Agent S: An Open Agentic Framework that Uses Computers Like a Human (2410.08164)

6. Automated Design of Agentic Systems (2408.08435)

7. AgentInstruct: Toward Generative Teaching with Agentic Flows (2407.03502)

8. AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant (2410.18603)

9. WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents (2410.07484)

10. Generative Agent Simulations of 1,000 People (2411.10109)

11. DynaSaur: Large Language Agents Beyond Predefined Actions (2411.01747)

12. PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking (2410.12375)

13. Generative World Explorer (2411.11844)

14. Bel Esprit: Multi-Agent Framework for Building AI Model Pipelines (2412.14684)

15. AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions (2410.20424)

Thanks for reading Turing Post!
Subscribe to receive new posts straight into your inbox -> https://www.turingpost.com/subscribe

reacted to nroggendorff's post with 👀 2 days ago

Post

1262

Can we please do something about this? It makes everything I do so much harder, and because my local machine is so terrible, I am forced to test in production. This makes debugging so difficult.
nroggendorff/system-exit

cc @victor


upvoted a collection 3 days ago

Vision Language Models

Collection

Grounding, chat • 4 items • Updated 6 days ago • 10

updated a model 3 days ago

strangerzonehf/Flux-Xmas-Isometric-Kit-LoRA

Text-to-Image • Updated 3 days ago • 173 • 9

New activity in strangerzonehf/Flux-Xmas-Isometric-Kit-LoRA 3 days ago

first image open in gallery?

1

#1 opened 3 days ago by victor

liked a Space 3 days ago

Running on Zero

599

🌍

Video Dubbing (SoniTranslate)

Video Dubbing with Open Source Projects

upvoted a paper 4 days ago

AniDoc: Animation Creation Made Easier

Paper • 2412.14173 • Published 7 days ago • 48

updated a Space 4 days ago

Running

9

📚

Dom To Semantic Markdown

liked 2 models 4 days ago

fofr/flux-xmas-sweater

Text-to-Image • Updated 4 days ago • 224 • 3

IamCreateAI/Ruyi-Mini-7B

Image-to-Video • Updated about 8 hours ago • 12.8k • 458

upvoted a paper 5 days ago

Spectrum: Targeted Training on Signal to Noise Ratio

Paper • 2406.06623 • Published Jun 7 • 11

liked a dataset 5 days ago

O1-OPEN/OpenO1-SFT

Viewer • Updated 9 days ago • 77.7k • 2.12k • 261

upvoted a paper 5 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 6 days ago • 327

liked a model 6 days ago

Datou1111/Yoji_Shinkawa

Text-to-Image • Updated Sep 7 • 236 • 13

reacted to anton-l's post with 🔥 6 days ago

Post

1964

Introducing 📐𝐅𝐢𝐧𝐞𝐌𝐚𝐭𝐡: the best public math pre-training dataset with 50B+ tokens!
HuggingFaceTB/finemath

Math remains challenging for LLMs and by training on FineMath we see considerable gains over other math datasets, especially on GSM8K and MATH.

We build the dataset by:
🛠️ carefully extracting math data from Common Crawl;
🔎 iteratively filtering and recalling high quality math pages using a classifier trained on synthetic annotations to identify math reasoning and deduction.

We conducted a series of ablations comparing the performance of Llama-3.2-3B-Base after continued pre-training on FineMath and observed notable gains compared to the baseline model and other public math datasets.

We hope this helps advance the performance of LLMs on math and reasoning! 🚀
We’re also releasing all the ablation models as well as the evaluation code.

HuggingFaceTB/finemath-6763fb8f71b6439b653482c2

reacted to m-ric's post with 🔥 6 days ago

Post

1629

After 6 years, BERT, the workhorse of encoder models, finally gets a replacement: 𝗪𝗲𝗹𝗰𝗼𝗺𝗲 𝗠𝗼𝗱𝗲𝗿𝗻𝗕𝗘𝗥𝗧! 🤗

We talk a lot about ✨Generative AI✨, meaning the decoder version of the Transformer architecture, but this is only one of the ways to build LLMs: encoder models, which turn a sentence into a vector, are perhaps even more widely used in industry than generative models.

The workhorse for this category has been BERT since its release in 2018 (that's prehistory for LLMs).

It's not a fancy 100B-parameter supermodel (just a few hundred million parameters), but it's an excellent workhorse, kind of a Honda Civic for LLMs.

Many applications use BERT-family models - the top models in this category have accumulated millions of downloads on the Hub.

➡️ Now a collaboration between Answer.AI and LightOn just introduced BERT's replacement: ModernBERT.

𝗧𝗟;𝗗𝗥:
🏛️ Architecture changes:
⇒ First, standard modernizations:
- Use rotary positional embeddings (RoPE)
- Replace GeLU with GeGLU
- Use Flash Attention 2
✨ The team also introduced innovative techniques like alternating attention instead of full attention, and sequence packing to get rid of padding overhead.
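To make the padding point concrete, here is a minimal sketch of the general idea behind sequence packing. It is not ModernBERT's actual implementation: the helper names `pad_batch` and `pack_batch`, the token IDs, and the `cu_seqlens` boundary list are all assumptions chosen for illustration. It only compares how many token slots a padded batch uses against a packed one.

```python
from itertools import accumulate


def pad_batch(seqs, pad_id=0):
    """Pad every sequence to the length of the longest one (classic batching)."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


def pack_batch(seqs):
    """Concatenate sequences into one flat list, with no padding tokens.

    Returns the flat token list and cumulative boundaries, so that
    sequence i occupies flat[cu_seqlens[i]:cu_seqlens[i + 1]].
    """
    flat = [tok for s in seqs for tok in s]
    cu_seqlens = [0] + list(accumulate(len(s) for s in seqs))
    return flat, cu_seqlens


# Hypothetical token IDs for three sentences of different lengths.
seqs = [[101, 7, 8, 102], [101, 9, 102], [101, 3, 4, 5, 6, 102]]

padded = pad_batch(seqs)
flat, cu_seqlens = pack_batch(seqs)

padded_slots = sum(len(row) for row in padded)  # 3 rows x 6 tokens
packed_slots = len(flat)                        # only real tokens

# Recover the original sequences from the packed layout.
unpacked = [flat[a:b] for a, b in zip(cu_seqlens, cu_seqlens[1:])]

print(padded_slots, packed_slots, cu_seqlens)
```

In this toy batch the padded layout spends 18 slots on 13 real tokens. An attention kernel that respects the boundaries can then keep each sequence from attending to its neighbours in the packed buffer.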

🥇 As a result, the model tops the game of encoder models:
It beats the previous standard, DeBERTaV3, with 1/5th the memory footprint, and runs 4x faster!

Read the blog post 👉 https://huggingface.co/blog/modernbert

