Friedrich Marty

Smorty100

AI & ML interests

I'm most interested in content rerouting between LLM and VLM agents for automation possibilities. Using a template for each agent, which is then filled in by another agent's inputs, seems really useful.
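A toy sketch of what I mean (the agent names, templates, and the call_llm helper are all made up, just to illustrate the routing idea):

```python
# Toy illustration: each agent has its own prompt template, and the slots are
# filled with another agent's output before the prompt is routed onward.

SUMMARIZER_TEMPLATE = "Summarize the following scene description in one sentence:\n{scene}"
PLANNER_TEMPLATE = "Given this summary, propose the next automation step:\n{summary}"

def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real LLM / VLM client here.
    return f"[model response to: {prompt[:40]}...]"

def run_agent(template: str, **slots: str) -> str:
    # Fill the agent's template with whatever another agent produced.
    prompt = template.format(**slots)
    return call_llm(prompt)

# A VLM agent's description of an image is routed into the summarizer,
# whose output in turn fills the planner's template.
scene_text = "A robot arm hovers over a tray of unsorted screws..."
summary = run_agent(SUMMARIZER_TEMPLATE, scene=scene_text)
next_step = run_agent(PLANNER_TEMPLATE, summary=summary)
print(next_step)
```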

Organizations

None yet

Smorty100's activity

reacted to WENGSYX's post with 😔 4 days ago
🔬 Exciting Research Breakthrough! 🚀
We've developed new AI research-assistant LLMs, trained through RL, that can:
- Generate research ideas from reference literature
- Preview potential research methodologies
- Automatically draft research reports
- Transform experimental results directly into academic papers! 📝

See it at WestlakeNLP/CycleResearcher-12B

Check out our free demo at http://ai-researcher.cn and experience the future of academic research workflows. 🌐

Proud to share that our work has been accepted as a Poster at ICLR 2025! 🏆 #AIResearch #AcademicInnovation #MachineLearning
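A minimal sketch of trying the released checkpoint, assuming it loads as a standard Hugging Face causal LM (check the model card for the intended prompt format):

```python
# Assumption: WestlakeNLP/CycleResearcher-12B loads as a regular causal LM;
# the prompt below is only a placeholder, not the model's official format.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "WestlakeNLP/CycleResearcher-12B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Here are three related papers on retrieval-augmented generation:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```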
reacted to nroggendorff's post with ❤️ 6 days ago
We're using RLHF on diffusion models, right? Just making sure..
reacted to singhsidhukuldeep's post with 👍 6 days ago
Exciting New Tool for Knowledge Graph Extraction from Plain Text!

I just came across a groundbreaking new tool called KGGen that's solving a major challenge in the AI world - the scarcity of high-quality knowledge graph data.

KGGen is an open-source Python package that leverages language models to extract knowledge graphs (KGs) from plain text. What makes it special is its innovative approach to clustering related entities, which significantly reduces sparsity in the extracted KGs.

The technical approach is fascinating:

1. KGGen uses a multi-stage process involving an LLM (GPT-4o in their implementation) to extract entities and relations from source text
2. It aggregates graphs across sources to reduce redundancy
3. Most importantly, it applies iterative LM-based clustering to refine the raw graph

The clustering stage is particularly innovative - it identifies which nodes and edges refer to the same underlying entities or concepts. This normalizes variations in tense, plurality, stemming, and capitalization (e.g., "labors" clustered with "labor").
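A schematic sketch of that extract-then-cluster flow in plain Python (this is not the kg-gen API; call_llm is a hypothetical stand-in for the GPT-4o calls, and the prompts are illustrative):

```python
# Schematic only: stage 1 extracts (subject, relation, object) triples with an
# LLM, stage 3 asks the LLM to merge surface forms of the same entity.
import json

def call_llm(prompt: str) -> str:
    # Placeholder for whatever LLM client you use (e.g. GPT-4o).
    raise NotImplementedError("plug in your LLM client here")

def extract_triples(text: str) -> list[tuple[str, str, str]]:
    # Stage 1: ask the LLM for triples as a JSON list.
    prompt = (
        "Extract knowledge-graph triples from the text below. "
        "Reply as a JSON list of [subject, relation, object] triples.\n\n" + text
    )
    return [tuple(t) for t in json.loads(call_llm(prompt))]

def cluster_entities(entities: list[str]) -> dict[str, str]:
    # Stage 3: map each surface form to a canonical entity name, normalizing
    # tense, plurality, and capitalization (e.g. {"labors": "labor"}).
    prompt = (
        "Group these entity names when they refer to the same concept and "
        "return a JSON object mapping each name to a canonical form:\n"
        + json.dumps(sorted(set(entities)))
    )
    return json.loads(call_llm(prompt))
```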

The researchers from Stanford and University of Toronto also introduced MINE (Measure of Information in Nodes and Edges), the first benchmark for evaluating KG extractors. When tested against existing methods like OpenIE and GraphRAG, KGGen outperformed them by up to 18%.

For anyone working with knowledge graphs, RAG systems, or KG embeddings, this tool addresses the fundamental challenge of data scarcity that's been holding back progress in graph-based foundation models.

The package is available via pip install kg-gen, making it accessible to everyone. This could be a game-changer for knowledge graph applications!
reacted to Reality123b's post with 😔 20 days ago
reacted to lewtun's post with ❤️ 25 days ago
Introducing OpenR1-Math-220k!

open-r1/OpenR1-Math-220k

The community has been busy distilling DeepSeek-R1 from inference providers, but we decided to have a go at doing it ourselves from scratch 💪

What’s new compared to existing reasoning datasets?

♾ Based on AI-MO/NuminaMath-1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.

🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek R1. The filtered dataset contains 220k problems with correct reasoning traces.

📤 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day.

⏳ Automated filtering: We apply Math Verify to only retain problems with at least one correct answer. We also leverage Llama3.3-70B-Instruct as a judge to retrieve more correct examples (e.g. for cases with malformed answers that can’t be verified with a rules-based parser).

📊 We match the performance of DeepSeek-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset.

🔎 Read our blog post for all the nitty gritty details: https://huggingface.co./blog/open-r1/update-2
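A rough sketch of what that rules-based filtering step could look like, assuming the open-source Math-Verify package; the sample's field names are made up for illustration:

```python
# Sketch of the filtering idea: keep a problem only if at least one generated
# trace reaches the gold answer. Assumes `pip install math-verify`; the
# dictionary keys below are illustrative, not the dataset's real schema.
from math_verify import parse, verify

def keep_sample(sample: dict) -> bool:
    gold = parse(sample["gold_answer"])       # reference answer (e.g. from NuminaMath)
    for trace in sample["r1_traces"]:         # the generated DeepSeek-R1 answers
        if verify(gold, parse(trace["final_answer"])):
            return True
    return False  # malformed or incorrect answers fall through to the LLM judge
```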
replied to nroggendorff's post 25 days ago

this is so real...

just like - address the things you don't like, don't tell it to us through your weird games.

it'd be fun if it were treated like in kindergarten, where you throw a ball around and say a thing. but somehow it's not... no, these activities are not as self-reflective as you would hope they'd be ;(

reacted to nroggendorff's post with 👍 25 days ago
Dearest None-yet Team,

I couldn't help but notice that our productivity has room for improvement. To address this, we will be engaging in a company-wide morale-building activity designed to boost teamwork, enthusiasm, and *most importantly* results.

I know you're all as excited as I am for this fun and absolutely required initiative. Participation is not just encouraged, it's mandatory. Think of it as a team-bonding experience you never signed up for but will absolutely tolerate.

More details to follow, but for now, mark your calendars and prepare for an engaging experience that will definitely make us all better, stronger, and more synchronized, or at least give us something to talk about later.

Looking forward to seeing you all there!

Best,
Me
New activity in huggingchat/chat-ui 26 days ago

[MODELS] Discussion

#372 opened about 1 year ago by victor
reacted to schuler's post with 👍 26 days ago
📢 New Research Alert: Making Language Models Smaller & Smarter!

Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance.

The secret? Grouped pointwise convolutions. Yes. We brought a method from computer vision to the transformers arena.

🔑 Key Findings:
• 77% parameter reduction.
• Maintained model capabilities.
• Improved generalization.

Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
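For intuition, a generic PyTorch illustration of swapping a dense projection for a grouped pointwise (kernel size 1) convolution; the sizes are made up, and the report's actual architecture is in the paper and code linked above:

```python
# Illustrative only: a grouped 1x1 convolution as a cheaper stand-in for a
# dense projection. With groups=8, each output channel only mixes 1/8 of the
# input channels, cutting the weight count roughly by the group factor.
import torch
import torch.nn as nn

d_model, groups = 1024, 8

dense = nn.Linear(d_model, d_model)                                    # ~1.05M params
grouped = nn.Conv1d(d_model, d_model, kernel_size=1, groups=groups)    # ~132k params

x = torch.randn(2, 16, d_model)                            # (batch, seq_len, d_model)
y_dense = dense(x)
y_grouped = grouped(x.transpose(1, 2)).transpose(1, 2)     # Conv1d expects (batch, channels, seq)

print(sum(p.numel() for p in dense.parameters()))    # 1,049,600
print(sum(p.numel() for p in grouped.parameters()))  # 132,096
```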
reacted to schuler's post with 🤯 27 days ago
replied to prithivMLmods's post about 1 month ago