All HF Hub posts

inflatebot posted an update 2 days ago
!!SEE UPDATE BELOW!!
I don't know who still needs to hear this, but if you're using Mistral Nemo-based models, you might have been using the wrong completions format. This is a signal boost from MarinaraSpaghetti's model card for NemoMix-Unleashed: MarinaraSpaghetti/NemoMix-Unleashed-12B
A lot of people have been working with a version of Nemo that's been reconfigured for ChatML, and while that works great, simply using the right format might be just as effective at correcting the weirdness people in the AI RP scene sometimes run into with Nemo.

Huge ups to Marinara for pointing this out, and to the MistralAI team member who let her know.

Update: A PR has been merged to SillyTavern Staging with new corrected templates! If you don't want to switch or wait, I put them up on GitHub: https://github.com/inflatebot/SillyTavern-Mistral-Templates

PRs for KoboldCPP's chat adapters and KoboldAI Lite *have been merged* and are coming in their respective releases (probably the next time KoboldCPP updates -- it didn't make it for 1.75.1, but you could just grab 'em from the repo!)
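If you'd rather not guess at the format, one safe option is to pull it straight from the official tokenizer rather than hand-rolling ChatML. A minimal sketch, assuming the transformers library and the mistralai/Mistral-Nemo-Instruct-2407 tokenizer (my illustration, not code from the post or the linked templates):

# Render a conversation with the chat template the Nemo tokenizer actually ships with,
# instead of assuming a ChatML-style format.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")

messages = [
    {"role": "user", "content": "Write a two-line poem about rivers."},
]

# Produces the [INST] ... [/INST] style prompt that Nemo expects,
# with the assistant turn left open for generation.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)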
KingNish posted an update 2 days ago
reach-vb posted an update 2 days ago
Less than two days ago Kyutai Labs open sourced Moshi - a ~7.6B on-device Speech to Speech foundation model - and Mimi - a SoTA streaming speech codec! 🔥

The release includes:

1. Moshiko & Moshika - Moshi finetuned on synthetic data (CC-BY license) ( kyutai/moshi-v01-release-66eaeaf3302bef6bd9ad7acd)
2. Mimi - Streaming Audio Codec; processes 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps (CC-BY license) ( kyutai/mimi)
3. Model checkpoints & Inference codebase written in Rust (Candle), PyTorch & MLX (Apache license) (https://github.com/kyutai-labs/moshi)
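As a sanity check, the 1.1 kbps figure is consistent with a small stack of discrete codebooks at the 12.5 Hz frame rate. Back-of-the-envelope arithmetic below; the codebook count and size are my assumptions, not numbers from the post:

# Rough bitrate check for a 12.5 Hz codec.
frame_rate_hz = 12.5
num_codebooks = 8        # assumption: 8 residual codebooks
bits_per_codebook = 11   # assumption: 2048-entry codebooks -> log2(2048) = 11 bits
bitrate_kbps = frame_rate_hz * num_codebooks * bits_per_codebook / 1000
print(f"bitrate: {bitrate_kbps:.2f} kbps")               # 1.10 kbps, matching the post
print(f"frame duration: {1000 / frame_rate_hz:.0f} ms")  # 80 ms per 12.5 Hz frame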

How does Moshi work?

1. Moshi processes two audio streams: one for itself and one for the user, with the user's stream coming from audio input and Moshi's stream generated by the model.

2. Along with these audio streams, Moshi predicts text tokens for its speech, enhancing its generation quality.

3. The model uses a small Depth Transformer for codebook dependencies and a large 7B parameter Temporal Transformer for temporal dependencies.

4. The theoretical latency is 160ms, with a practical latency of around 200ms on an L4 GPU.
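Put together, one decoding step looks roughly like the sketch below. This is a toy schematic of the idea described above (stand-in functions, assumed codebook count), not Kyutai's actual implementation:

# Toy schematic of one Moshi-style decoding step:
# a large Temporal Transformer handles dependencies across time steps,
# a small Depth Transformer handles dependencies across codebooks within a step.
NUM_CODEBOOKS = 8   # assumption, not stated in the post
VOCAB = 2048        # assumption

def temporal_transformer(token_history):
    # Stand-in for the 7B model: summarizes both audio streams and past text tokens.
    return hash(tuple(token_history)) % VOCAB

def depth_transformer(context, text_token, partial_codes):
    # Stand-in for the small model that fills in codebook tokens one at a time.
    return (context + text_token + sum(partial_codes)) % VOCAB

def decode_step(history, user_audio_token):
    context = temporal_transformer(history + [user_audio_token])
    text_token = context                # Moshi first predicts a text token for its own speech
    audio_tokens = []
    for _ in range(NUM_CODEBOOKS):      # then the audio codebook tokens for this ~80 ms frame
        audio_tokens.append(depth_transformer(context, text_token, audio_tokens))
    return text_token, audio_tokens

print(decode_step([17, 42, 99], user_audio_token=7))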

Model size & inference:

Moshiko/ka are 7.69B param models

bf16 ~16GB VRAM
8-bit ~8GB VRAM
4-bit ~4GB VRAM

You can run inference via Candle 🦀, PyTorch and MLX - based on your hardware.
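Those VRAM figures line up with simple weights-only arithmetic (ignoring activations and the KV cache):

# Weights-only VRAM estimate from the 7.69B parameter count.
params = 7.69e9
for name, bytes_per_param in [("bf16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{name}: ~{gib:.1f} GiB")  # ~14.3 / ~7.2 / ~3.6 GiB -> roughly the 16 / 8 / 4 GB above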

The Kyutai team (@adefossez, @lmz, and colleagues) are cracked AF; they're bringing some serious firepower to the open source/science AI scene. Looking forward to what's next! 🐐
kz919 posted an update 2 days ago
Just for the meme.

But the clear lesson I learned from building these demos is that the more powerful the underlying base model, the closer you get to GPT-4 o1. CoT is nothing more than inducing the latent reasoning capability already in the model.

kz919/GPT4-O1-Proximas
MonsterMMORPG posted an update about 23 hours ago
I have run an extensive multi-GPU FLUX Full Fine-Tuning / DreamBooth training experiment on RunPod using 2x A100 80GB GPUs (PCIe), since this was commonly asked of me.

Full article here : https://medium.com/@furkangozukara/multi-gpu-flux-fu

Image 1
Image 1 shows that just the first part of the Kohya GUI installation took 30 minutes on such a powerful machine, on a very expensive Secure Cloud pod (3.28 USD per hour).
There was also a part 2, so the installation alone took a very long time.
On Massed Compute, it would take around 2–3 minutes.
This is why I suggest using Massed Compute over RunPod: RunPod machines have terrible hard disk speeds, and getting a good one is a lottery.



Image 2, 3 and 4
Image 2 shows the speed of our very best FLUX Fine-Tuning training config (shared below) when doing 2x multi-GPU training
https://www.patreon.com/posts/kohya-flux-fine-112099700
The config used is: Quality_1_27500MB_6_26_Second_IT.json
Image 3 shows the VRAM usage of this config during 2x multi-GPU training
Image 4 shows the GPUs of the pod


Image 5 and 6
Image 5 shows the speed of our very best FLUX Fine-Tuning training config (shared below) when doing single-GPU training
https://www.patreon.com/posts/kohya-flux-fine-112099700
The config used is: Quality_1_27500MB_6_26_Second_IT.json
Image 6 shows the VRAM usage of this setup


Image 7 and 8
Image 7 shows the speed of our very best FLUX Fine-Tuning training config (shared below) when doing single-GPU training with Gradient Checkpointing disabled
https://www.patreon.com/posts/kohya-flux-fine-112099700
The config used is: Quality_1_27500MB_6_26_Second_IT.json
Image 8 shows the VRAM usage of this setup


....
loztcontrol posted an update 1 day ago
I am developing a personal project to further support and help people living with depression and anxiety. As I suffer mainly from chronic depression, I would like to create an AI-based tool that can monitor my moods. First I will collect information about myself and my moods, and after collecting at least 6 months of my moods and my writings, I will be able to build a kind of recognition for when my emotions are "out of control", by which I mean those states or feelings of emptiness. I think that not all of us always have access to treatments and therapies, so I would like to develop this project freely; I have just started it today. I have already written code to register events of my moods. I will share the updates with you :D


import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report
import nltk
from nltk.corpus import stopwords
import string
import matplotlib.pyplot as plt
from datetime import datetime

nltk.download('stopwords')

data = {
    'text': [
        "Hoy me siento bien, aunque un poco cansado", 
        "Me siento triste y solo", 
        "Esto es frustrante, todo sale mal", 
        "Estoy nervioso por lo que va a pasar",
        "No puedo con este estrés", 
        "Todo está saliendo bien, me siento optimista", 
        "Siento miedo de lo que pueda suceder", 
        "Hoy fue un día horrible"
    ],
    'emotion': [
        'felicidad', 
        'tristeza', 
        'enojo', 
        'ansiedad', 
        'ansiedad', 
        'felicidad', 
        'miedo', 
        'tristeza'
    ]
}

df = pd.DataFrame(data)

# Function to clean the text: lowercase, strip punctuation, and drop Spanish stopwords
def clean_text(text):
    text = text.lower()
    text = text.translate(str.maketrans('', '', string.punctuation))
    words = [word for word in text.split() if word not in stopwords.words('spanish')]
    return ' '.join(words)

Yes, I speak Spanish :P too
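For anyone curious where the unused imports above are headed, here is one hedged way the snippet could continue: train and evaluate a small Naive Bayes emotion classifier, then log timestamped mood events. This is my continuation (it assumes the snippet above has already run), not loztcontrol's actual code, and with only 8 examples the scores are meaningless:

# Clean the toy texts and build bag-of-words features.
df['clean_text'] = df['text'].apply(clean_text)
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(df['clean_text'])
y = df['emotion']

# Tiny train/test split just to exercise the pipeline.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

model = MultinomialNB()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred, zero_division=0))

# Register a timestamped mood event, as described in the post.
def log_mood(text):
    predicted = model.predict(vectorizer.transform([clean_text(text)]))[0]
    return {'timestamp': datetime.now().isoformat(), 'text': text, 'emotion': predicted}

print(log_mood("Hoy me siento tranquilo"))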
dylanebert posted an update 2 days ago
m-ric posted an update 2 days ago
🧠 This Stanford paper might hold the key to OpenAI o1's performance: what's so effective about Chain of Thought? ⇒ It unlocks radically deeper sequential tasks!

💭 Reminder: A Chain of Thought (CoT) means that you instruct the model to “think step by step”. Often it’s literally just putting in the prompt “let’s think step by step.”

🤔 This method has been shown to be unreasonably effective at increasing performance on benchmarks. However, why it works so well remains unclear.

Here's the scoop: Transformers are amazing at parallel processing, but they've always struggled with tasks that require sequential reasoning.

⛔️ For instance, if you ask them for the result of 3^2^2^2^…, with 20 iterations, they'll nearly always fail.

💡 Indeed, the researchers prove mathematically, by modeling transformer networks as logical circuits, that on their own they cannot solve sequential tasks requiring more than a certain number of serial steps.

But CoT enables sequential reasoning:

- 🧱 Each step in the CoT corresponds to simulating one operation in a complex circuit.
- 🔄 This allows the transformer to "reset" the depth of intermediate outputs, overcoming previous limitations.
- 🚀 Thus, with CoT, constant-depth transformers can now solve ANY problem computable by polynomial-size circuits! (That's a huge class of problems in computer science.)
- 🔑 Transformers can now handle tricky tasks like iterated squares (computing 3^2^2^2^2), composed permutations, and evaluating circuits - stuff that requires serial computation (see the small sketch after this list).
- 📊 The improvement is especially dramatic for transformers with a limited depth. Empirical tests on four arithmetic problems showed massive accuracy gains with CoT on inherently serial tasks.
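To make the "inherently serial" point concrete, here is a tiny illustration of the iterated-squaring task mentioned above: each step depends on the previous result, so there is no parallel shortcut, but writing out every intermediate value (exactly what a CoT does) makes each individual step trivial. The modulus is my simplification to keep the numbers small:

# Iterated squaring: x -> x^2 -> (x^2)^2 -> ...; step k needs the result of step k-1.
def iterated_square(x, steps, mod=10007):
    trace = [x]               # the "chain of thought": every intermediate value
    for _ in range(steps):
        x = (x * x) % mod     # one serial step, trivially easy in isolation
        trace.append(x)
    return x, trace

result, trace = iterated_square(3, steps=20)
print("final value:", result)
print("chain of thought:", trace)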

Main takeaway: Chain-of-thought isn't just a neat trick - it fundamentally expands what transformer models can do!

Read the paper 👉  Chain of Thought Empowers Transformers to Solve Inherently Serial Problems (2402.12875)
tomaarsen posted an update 3 days ago
🎉SetFit v1.1.0 is out! Training efficient classifiers on CPU or GPU now uses the Sentence Transformers Trainer, and we resolved a lot of issues caused by updates of third-party libraries (like Transformers). Details:

Training a SetFit classifier model consists of 2 phases:
1. Finetuning a Sentence Transformer embedding model
2. Training a Classifier to map embeddings -> classes

🔌The first phase now uses the SentenceTransformerTrainer that was introduced in the Sentence Transformers v3 update. This brings some immediate upsides like MultiGPU support, without any (intended) breaking changes.

➡️ Beyond that, we softly deprecated the "evaluation_strategy" argument in favor of "eval_strategy" (following a Transformers deprecation), and deprecated Python 3.7. In return, we add official support for Python 3.11 and 3.12.

✨ There's some more minor changes too, like max_steps and eval_max_steps now being a hard limit instead of an approximate one, training/validation losses now logging nicely in Notebooks, and the "device" parameter no longer being ignored in some situations.

Check out the full release notes here: https://github.com/huggingface/setfit/releases/tag/v1.1.0
Or read the documentation: https://huggingface.co./docs/setfit
Or check out the public SetFit models for inspiration: https://huggingface.co./models?library=setfit&sort=created

P.s. the model in the code snippet trained in 1 minute and it can classify ~6000 sentences per second on my GPU.
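The snippet itself doesn't show up in this post view, so for orientation, a minimal SetFit v1.1-style run looks roughly like the following. This is a sketch with a placeholder base model and dataset, not the exact snippet from the post:

from datasets import load_dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Phase 1: the Sentence Transformer to finetune; phase 2 (the classifier head)
# is handled internally by SetFit.
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

# SetFit targets few-shot setups, so only a handful of labeled examples are used.
dataset = load_dataset("sst2")
train_ds = dataset["train"].shuffle(seed=42).select(range(64))
eval_ds = dataset["validation"].select(range(200))

args = TrainingArguments(
    batch_size=16,
    num_epochs=1,
    eval_strategy="epoch",  # "evaluation_strategy" is softly deprecated in v1.1.0
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    metric="accuracy",
    column_mapping={"sentence": "text", "label": "label"},  # map sst2 columns to SetFit's names
)
trainer.train()
print(trainer.evaluate())
print(model.predict(["a gripping, well-acted film", "a dull and lifeless mess"]))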
zolicsaki posted an update 2 days ago
We’ve open-sourced an app, powered by SambaNova Cloud and Llama 405B, that intelligently detects when a web search is needed—then answers directly or with RAG.

sambanovasystems/auto-web-search

🥚 A hidden Easter egg is that Auto Search detection is already trained into Llama 3.1 checkpoints. Simply use the tool usage system prompt below, and the model will either respond with a web search query if it deems necessary or respond to the query directly.🥚

Environment: IPython
Tools: Brave Search
Knowledge Cutoff Date: December 2023
Today's Date: September 2024
You are a helpful assistant. Reminder:
Search function calls MUST follow the specified format: "brave_search.call(query)"

You can see the documentation here
https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1#built-in-tooling
and read about how the tool usage was trained into Llama3.1 models in section 4.3.5 here https://arxiv.org/pdf/2407.21783
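A hedged sketch of how an app can act on that system prompt: send it along with the user's question, then check whether the model replied with a brave_search.call(...) tool call (run the web search and answer with RAG) or with a direct answer. The routing code below is my illustration, not SambaNova's implementation:

import re

SYSTEM_PROMPT = """Environment: IPython
Tools: Brave Search
Knowledge Cutoff Date: December 2023
Today's Date: September 2024
You are a helpful assistant. Reminder:
Search function calls MUST follow the specified format: "brave_search.call(query)"
"""

def route(model_output: str):
    """Decide between 'run a web search' and 'answer directly' from the model's reply."""
    match = re.search(r'brave_search\.call\(\s*(?:query=)?["\'](.+?)["\']\s*\)', model_output)
    if match:
        return {"action": "web_search", "query": match.group(1)}
    return {"action": "direct_answer", "answer": model_output}

# Example replies a Llama 3.1 checkpoint might produce with this prompt:
print(route('brave_search.call(query="llama 3.1 benchmark results september 2024")'))
print(route("Paris is the capital of France."))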