Friedrich Marty

Smorty100

AI & ML interests

I'm most interested in content rerouting between LLM and VLM agents for automation possibilities. Using a template for each agent, which is then filled in by another agent's inputs, seems really useful.
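A toy sketch of what I mean (the agent names, templates, and the call_llm helper are all made up, just to illustrate the routing idea):

```python
# Toy illustration: each agent has its own prompt template, and the slots are
# filled with another agent's output before the prompt is routed onward.

SUMMARIZER_TEMPLATE = "Summarize the following scene description in one sentence:\n{scene}"
PLANNER_TEMPLATE = "Given this summary, propose the next automation step:\n{summary}"

def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real LLM / VLM client here.
    return f"[model response to: {prompt[:40]}...]"

def run_agent(template: str, **slots: str) -> str:
    # Fill the agent's template with whatever another agent produced.
    prompt = template.format(**slots)
    return call_llm(prompt)

# A VLM agent's description of an image is routed into the summarizer,
# whose output in turn fills the planner's template.
scene_text = "A robot arm hovers over a tray of unsorted screws..."
summary = run_agent(SUMMARIZER_TEMPLATE, scene=scene_text)
next_step = run_agent(PLANNER_TEMPLATE, summary=summary)
print(next_step)
```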

Organizations

None yet

Smorty100's activity

reacted to WENGSYX's post with 😔 4 days ago
🔬 Exciting Research Breakthrough! 🚀
We've developed new AI research-assistant LLMs, trained through RL, that can:
- Generate research ideas from reference literature
- Preview potential research methodologies
- Automatically draft research reports
- Transform experimental results directly into academic papers! 📝

See it at WestlakeNLP/CycleResearcher-12B

Check out our free demo at http://ai-researcher.cn and experience the future of academic research workflows. 🌐

Proud to share that our work has been accepted as a Poster at ICLR 2025! 🏆 #AIResearch #AcademicInnovation #MachineLearning
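A minimal sketch of trying the released checkpoint, assuming it loads as a standard Hugging Face causal LM (check the model card for the intended prompt format):

```python
# Assumption: WestlakeNLP/CycleResearcher-12B loads as a regular causal LM;
# the prompt below is only a placeholder, not the model's official format.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "WestlakeNLP/CycleResearcher-12B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Here are three related papers on retrieval-augmented generation:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```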
reacted to nroggendorff's post with ❤️ 6 days ago
We're using RLHF on diffusion models, right? Just making sure..
reacted to singhsidhukuldeep's post with 👍 6 days ago
Exciting New Tool for Knowledge Graph Extraction from Plain Text!

I just came across a groundbreaking new tool called KGGen that's solving a major challenge in the AI world - the scarcity of high-quality knowledge graph data.

KGGen is an open-source Python package that leverages language models to extract knowledge graphs (KGs) from plain text. What makes it special is its innovative approach to clustering related entities, which significantly reduces sparsity in the extracted KGs.

The technical approach is fascinating:

1. KGGen uses a multi-stage process involving an LLM (GPT-4o in their implementation) to extract entities and relations from source text
2. It aggregates graphs across sources to reduce redundancy
3. Most importantly, it applies iterative LM-based clustering to refine the raw graph

The clustering stage is particularly innovative - it identifies which nodes and edges refer to the same underlying entities or concepts. This normalizes variations in tense, plurality, stemming, and capitalization (e.g., "labors" clustered with "labor").
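A schematic sketch of that extract-then-cluster flow in plain Python (this is not the kg-gen API; call_llm is a hypothetical stand-in for the GPT-4o calls, and the prompts are illustrative):

```python
# Schematic only: stage 1 extracts (subject, relation, object) triples with an
# LLM, stage 3 asks the LLM to merge surface forms of the same entity.
import json

def call_llm(prompt: str) -> str:
    # Placeholder for whatever LLM client you use (e.g. GPT-4o).
    raise NotImplementedError("plug in your LLM client here")

def extract_triples(text: str) -> list[tuple[str, str, str]]:
    # Stage 1: ask the LLM for triples as a JSON list.
    prompt = (
        "Extract knowledge-graph triples from the text below. "
        "Reply as a JSON list of [subject, relation, object] triples.\n\n" + text
    )
    return [tuple(t) for t in json.loads(call_llm(prompt))]

def cluster_entities(entities: list[str]) -> dict[str, str]:
    # Stage 3: map each surface form to a canonical entity name, normalizing
    # tense, plurality, and capitalization (e.g. {"labors": "labor"}).
    prompt = (
        "Group these entity names when they refer to the same concept and "
        "return a JSON object mapping each name to a canonical form:\n"
        + json.dumps(sorted(set(entities)))
    )
    return json.loads(call_llm(prompt))
```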

The researchers from Stanford and University of Toronto also introduced MINE (Measure of Information in Nodes and Edges), the first benchmark for evaluating KG extractors. When tested against existing methods like OpenIE and GraphRAG, KGGen outperformed them by up to 18%.

For anyone working with knowledge graphs, RAG systems, or KG embeddings, this tool addresses the fundamental challenge of data scarcity that's been holding back progress in graph-based foundation models.

The package is available via pip install kg-gen, making it accessible to everyone. This could be a game-changer for knowledge graph applications!
reacted to Reality123b's post with 😔 20 days ago
reacted to lewtun's post with ❤️ 25 days ago
Introducing OpenR1-Math-220k!

open-r1/OpenR1-Math-220k

The community has been busy distilling DeepSeek-R1 from inference providers, but we decided to have a go at doing it ourselves from scratch 💪

What’s new compared to existing reasoning datasets?

♾ Based on AI-MO/NuminaMath-1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.

🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek R1. The filtered dataset contains 220k problems with correct reasoning traces.

📤 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day.

⏳ Automated filtering: We apply Math Verify to only retain problems with at least one correct answer. We also leverage Llama3.3-70B-Instruct as a judge to retrieve more correct examples (e.g. for cases with malformed answers that can’t be verified with a rules-based parser).

📊 We match the performance of DeepSeek-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset.

🔎 Read our blog post for all the nitty gritty details: https://huggingface.co./blog/open-r1/update-2
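A rough sketch of what that rules-based filtering step could look like, assuming the open-source Math-Verify package; the sample's field names are made up for illustration:

```python
# Sketch of the filtering idea: keep a problem only if at least one generated
# trace reaches the gold answer. Assumes `pip install math-verify`; the
# dictionary keys below are illustrative, not the dataset's real schema.
from math_verify import parse, verify

def keep_sample(sample: dict) -> bool:
    gold = parse(sample["gold_answer"])       # reference answer (e.g. from NuminaMath)
    for trace in sample["r1_traces"]:         # the generated DeepSeek-R1 answers
        if verify(gold, parse(trace["final_answer"])):
            return True
    return False  # malformed or incorrect answers fall through to the LLM judge
```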
replied to nroggendorff's post 25 days ago

this is so real...

just like - address the things you don't like, don't tell it to us through your weird games.

it'd be fun if it were treated like in kindergarten, where you throw a ball around and say a thing. but somehow it's not... no, these activities are not as self-reflective as you would hope they'd be ;(

reacted to nroggendorff's post with 👍 25 days ago
Dearest None-yet Team,

I couldn't help but notice that our productivity has room for improvement. To address this, we will be engaging in a company-wide morale-building activity designed to boost teamwork, enthusiasm, and *most importantly* results.

I know you're all as excited as I am for this fun and absolutely required initiative. Participation is not just encouraged, it's mandatory. Think of it as a team-bonding experience you never signed up for but will absolutely tolerate.

More details to follow, but for now, mark your calendars and prepare for an engaging experience that will definitely make us all better, stronger, and more synchronized, or at least give us something to talk about later.

Looking forward to seeing you all there!

Best,
Me
New activity in huggingchat/chat-ui 26 days ago

[MODELS] Discussion

#372 opened about 1 year ago by victor
reacted to schuler's post with 👍 26 days ago
📢 New Research Alert: Making Language Models Smaller & Smarter!

Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance.

The secret? Grouped pointwise convolutions. Yes. We brought a method from computer vision to the transformers arena.

🔑 Key Findings:
• 77% parameter reduction.
• Maintained model capabilities.
• Improved generalization.

Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
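For intuition, a generic PyTorch illustration of swapping a dense projection for a grouped pointwise (kernel size 1) convolution; the sizes are made up, and the report's actual architecture is in the paper and code linked above:

```python
# Illustrative only: a grouped 1x1 convolution as a cheaper stand-in for a
# dense projection. With groups=8, each output channel only mixes 1/8 of the
# input channels, cutting the weight count roughly by the group factor.
import torch
import torch.nn as nn

d_model, groups = 1024, 8

dense = nn.Linear(d_model, d_model)                                    # ~1.05M params
grouped = nn.Conv1d(d_model, d_model, kernel_size=1, groups=groups)    # ~132k params

x = torch.randn(2, 16, d_model)                            # (batch, seq_len, d_model)
y_dense = dense(x)
y_grouped = grouped(x.transpose(1, 2)).transpose(1, 2)     # Conv1d expects (batch, channels, seq)

print(sum(p.numel() for p in dense.parameters()))    # 1,049,600
print(sum(p.numel() for p in grouped.parameters()))  # 132,096
```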
reacted to schuler's post with 🤯 27 days ago
replied to prithivMLmods's post about 1 month ago