John6666 (John Smith)

reacted to sayakpaul's post with 🔥🚀🤗 about 3 hours ago

Post

375

We have authored a post to go over the state of video generation in the Diffusers ecosystem 🧨

We cover the supported models, the optimization knobs our users can turn, fine-tuning, and more 🔥

5-6GBs for HunyuanVideo, sky is the limit 🌌 🤗
https://huggingface.co./blog/video_gen

reacted to clem's post with 🤗 about 3 hours ago

Post

504

AI is not a zero-sum game. Open-source AI is the tide that lifts all boats!

reacted to KnutJaegersberg's post with 👀 about 3 hours ago

Post

190

Evolution and The Knightian Blindspot of Machine Learning

The paper discusses machine learning's limitations in addressing Knightian Uncertainty (KU), highlighting the fragility of models like reinforcement learning (RL) in unpredictable, open-world environments. KU refers to uncertainty that can't be quantified or predicted, a challenge that RL fails to handle due to its reliance on fixed data distributions and limited formalisms.

### Key Approaches:

1. **Artificial Life (ALife):** Simulating diverse, evolving systems to generate adaptability, mimicking biological evolution's robustness to unpredictable environments.

2. **Open-Endedness:** Creating AI systems capable of continuous innovation and adaptation, drawing inspiration from human creativity and scientific discovery.

3. **Revising RL Formalisms:** Modifying reinforcement learning (RL) models to handle dynamic, open-world environments by integrating more flexible assumptions and evolutionary strategies.

These approaches aim to address ML’s limitations in real-world uncertainty and move toward more adaptive, general intelligence.

https://arxiv.org/abs/2501.13075

reacted to luigi12345's post with 👀 about 6 hours ago

Post

289

# Essential AutoGen Examples: Code Writing, File Operations & Agent Tools

1. **Code Writing with Function Calls & File Operations**
- [Documentation](https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_function_call_code_writing/)
- [Notebook](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_function_call_code_writing.ipynb)
- *Key Tools Shown*:
- list_files() - Directory listing
- read_file(filename) - File reading
- edit_file(file, start_line, end_line, new_code) - Precise code editing
- Code validation and syntax checking
- File backup and restore

2. **Auto Feedback from Code Execution**
- [Documentation](https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_auto_feedback_from_code_execution/)
- [Notebook](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_auto_feedback_from_code_execution.ipynb)
- *Key Tools Shown*:
- execute_code(code) with output capture
- Error analysis and auto-correction
- Test case generation
- Iterative debugging loop

3. **Async Operations & Parallel Execution**
- [Documentation](https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_function_call_async/)
- [Notebook](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_function_call_async.ipynb)
- *Key Tools Shown*:
- Async function registration
- Parallel agent operations
- Non-blocking file operations
- Task coordination

4. **LangChain Integration & Advanced Tools**
- [Colab](https://colab.research.google.com/github/sugarforever/LangChain-Advanced/blob/main/Integrations/AutoGen/autogen_langchain_uniswap_ai_agent.ipynb)
- *Key Tools Shown*:
- Vector store integration
- Document QA chains
- Multi-agent coordination
- Custom tool creation

Most relevant for file operations and code editing is Example #1, which demonstrates the core techniques used in autogenie.py for file manipulation and code editing using line numbers and replacement.
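The line-number-based editing described in Example #1 can be sketched roughly as follows. This is a minimal standard-library illustration, not the AutoGen notebook's code or the autogenie.py implementation: the function names follow the tool names listed in the post, while the numbered-output format and the `.bak` backup naming are assumptions made for the example.

```python
from pathlib import Path


def list_files(directory: str) -> list[str]:
    """Return the sorted names of regular files in `directory`."""
    return sorted(p.name for p in Path(directory).iterdir() if p.is_file())


def read_file(filename: str) -> str:
    """Return the file's text with 1-based line numbers so an agent can cite them."""
    lines = Path(filename).read_text(encoding="utf-8").splitlines()
    return "\n".join(f"{i}: {line}" for i, line in enumerate(lines, start=1))


def edit_file(filename: str, start_line: int, end_line: int, new_code: str) -> str:
    """Replace lines start_line..end_line (1-based, inclusive) with new_code."""
    path = Path(filename)
    original = path.read_text(encoding="utf-8")
    lines = original.splitlines()
    if not (1 <= start_line <= end_line <= len(lines)):
        raise ValueError(f"invalid range {start_line}-{end_line} for {len(lines)} lines")
    # Assumption: keep a sibling ".bak" copy so a bad edit can be restored.
    path.with_name(path.name + ".bak").write_text(original, encoding="utf-8")
    # Slice assignment swaps the old range for the new lines, whatever their count.
    lines[start_line - 1:end_line] = new_code.splitlines()
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return f"replaced lines {start_line}-{end_line} of {path.name}"
```

An agent would typically call `read_file` first to see the numbered lines, then pass the range it wants to change to `edit_file`. Validating the range before writing keeps an off-by-one from the model from silently corrupting the file.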

reacted to davanstrien's post with 👀 about 6 hours ago

Post

402

🌍 Big step for multilingual AI data!

The Hugging Face community has rated educational content in languages spoken by 1.6 billion people! New additions:
• Japanese
• Italian
• Old High German

Learn more and contribute: https://huggingface.co./blog/davanstrien/fineweb2-community

These ratings can help enhance training data for major world languages.

reacted to mmaguero's post with 👀 about 6 hours ago

Post

268

🚀 Multidimensional Affective Analysis for Guarani/Jopara! 🌎

This project explored affective computing for low-resource languages, focusing on emotion recognition, humor detection, and offensive language identification in Guarani and Jopara (a code-switching mix of Guarani and Spanish).

Highlights:
🧵 Corpora:
- Emotion Recognition
- Humor Detection
- Offensive Language Identification
💻 Base Models for Fine-Tuning (trained on Guarani Wiki):
- From scratch: BERT-based tiny, small, base and large models
- Continuously pre-trained models: Multilingual-BERT and BETO
📓 Baseline Notebooks:
- Fine-tuning BERT-based models
- NCRF++ models via GitHub

💡 Check the repo!
https://github.com/mmaguero/guarani-multi-affective-analysis

📖 Check out the publication here:
- https://digibug.ugr.es/handle/10481/98843
- https://link.springer.com/article/10.1007/s12559-023-10165-0

#NLP #AffectiveComputing #LowResourceLanguages #Guarani #Jopara #SentimentAnalysis #AIForAll

reacted to singhsidhukuldeep's post with 🚀 about 6 hours ago

Post

244

While everyone is buzzing about DeepSeek AI R1's groundbreaking open-source release, ByteDance has quietly launched something remarkable: Trae, an adaptive AI IDE that's redefining the development experience. Unlike competitors like Cursor, it's completely FREE!

Trae is a sophisticated development environment built on Microsoft's VSCode foundation (with a nice skin on top), offering unlimited free access to both OpenAI's GPT-4o and Anthropic's Claude-3.5-Sonnet models.

Technical Highlights:
- Real-time AI pair programming with comprehensive codebase understanding
- Natural language commands for code generation and project-level development
- Intelligent task decomposition for automated planning and execution
- Seamless VS Code and Cursor configuration compatibility
- Multi-language support with specialized optimization for English and Chinese interfaces

Currently available for macOS (Windows version in development), Trae is distributed through ByteDance's Singapore subsidiary, Spring (SG) Pte. What sets it apart is its ability to handle mixed-language workflows and enhanced localization features that address common pain points in existing IDEs.

The AI assistant can generate code snippets, optimize logic, and even create entire projects from scratch through natural language prompts. It also features an innovative AI Chat system accessible via keyboard shortcuts for real-time coding assistance.

For developers looking to enhance their productivity without breaking the bank, Trae offers enterprise-grade AI capabilities completely free during its initial release. This move by ByteDance signals a significant shift in the AI IDE landscape, challenging established players with a robust, accessible alternative.

Try it at trae.ai

reacted to burtenshaw's post with 🚀 about 10 hours ago

Post

570

Manic few days in open source AI, with game-changing developments all over the place. Here's a roundup of the resources:

- The science team at @huggingface reproduced and open-sourced DeepSeek R1. https://github.com/huggingface/open-r1
- @qwen released a series of models with 1 million token context! https://qwenlm.github.io/blog/qwen2.5-1m/
- SmolVLM got even smaller with completely new variants at 256m and 500m https://huggingface.co./blog/smolervlm

There's so much you could do with these developments. Especially combining them together into agentic applications or fine-tuning them on your use case.

reacted to AdinaY's post with 🚀 about 10 hours ago

Post

786

🔥So many exciting releases coming from the Chinese community this month!
zh-ai-community/2025-january-6786b054f492fb223591269e

LLMs:
✨ Qwen2.5-1M by Alibaba
Qwen/qwen25-1m-679325716327ec07860530ba
✨ InternLM3-8B-Instruct by Shanghai AI Lab
internlm/internlm3-8b-instruct
✨ MiniMax-Text-01 by MiniMax AI
MiniMaxAI/MiniMax-Text-01
✨ RWKV-7 by BlinkDL -- RNN + Transformer 👀
BlinkDL/rwkv-7-world
✨ DeepSeek-R1 by DeepSeek -- THE ONE 🙌
https://huggingface.co./deepseek-ai
✨ Baichuan-M1-14B by Baichuan - Medical 🩺
baichuan-inc/Baichuan-M1-14B-Base
✨ Qwen2.5-Math-PRM by Alibaba - Math 🔢
Qwen/Qwen2.5-Math-PRM-7B

Code:
✨ Trae by ByteDance
https://trae.ai

TTS:
✨ T2A-01-HD by MiniMax AI
https://hailuo.ai/audio
✨ LLaSA by HKUST Audio
HKUSTAudio/Llasa-3B

MLLM:
✨ Kimi k1.5 by Moonshot AI
https://kimi.ai
✨ MiniCPM-o-2_6 by OpenBMB
openbmb/MiniCPM-o-2_6
✨ Sa2VA-4B by ByteDance
ByteDance/Sa2VA-4B
✨ VideoLLaMA 3 by Alibaba DAMO
DAMO-NLP-SG/videollama3-678cdda9281a0e32fe79af15
✨ LLaVA-Mini by Chinese Academy of Sciences
ICTNLP/llava-mini-llama-3.1-8b
✨ Hunyuan-7B by Tencent
tencent/Hunyuan-7B-Instruct
✨ Hunyuan 3D 2.0 by Tencent
tencent/Hunyuan3D-2
✨ MiniMax-VL-01 by MiniMax AI - A non-transformer-based VLM 👀
MiniMaxAI/MiniMax-VL-01

Agent:
✨ UI-TARS by Bytedance
bytedance-research/UI-TARS-7B-SFT
✨ GLM-PC by Zhipu AI
https://cogagent.aminer.cn

Dataset:
✨ Fineweb-Edu-Chinese by Opencsg
opencsg/Fineweb-Edu-Chinese-V2.1
✨ Multimodal_textbook by Alibaba
DAMO-NLP-SG/multimodal_textbook
✨ MME-Finance by Hithink AI

reacted to davidberenstein1957's post with 👀 about 10 hours ago

Post

692

Let's uncover the post-training dataset from DeepSeek-R1 with Magpie!

Pass pre-query tokens <｜begin▁of▁sentence｜>User: , let the model generate the rest.

We can get realistic examples!

Gist: https://gist.github.com/davidberenstein1957/3f20046ce57395a6aba13f8b4e956b59
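The idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code from the gist: the pre-query string is copied from the post, while `generate` is a hypothetical callable standing in for a real model call, and the `Assistant:` cut-off marker is an assumption about where the self-generated user turn ends.

```python
from typing import Callable

# Pre-query template copied from the post; everything else here is illustrative.
PRE_QUERY = "<｜begin▁of▁sentence｜>User: "
# Hypothetical marker at which the model's self-written user turn is assumed to end.
ASSISTANT_MARKER = "Assistant:"


def sample_instruction(generate: Callable[[str], str]) -> str:
    """Feed only the pre-query tokens and keep what the model writes as the user turn."""
    completion = generate(PRE_QUERY)
    # Cut at the assistant marker in case the model continued into a response.
    instruction = completion.split(ASSISTANT_MARKER, 1)[0]
    return instruction.strip()


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned continuation."""
    return "How do I reverse a linked list in Python?\n\nAssistant: You can ..."


print(sample_instruction(fake_generate))
```

Swapping `fake_generate` for a function that calls an actual model, with sampling enabled, would yield a different synthetic instruction on each call, which is what makes the approach useful for building datasets.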

reacted to crodri's post with 👀 about 10 hours ago

Post

405

At the Language Technologies Unit of the Barcelona Supercomputing Center, we are developing state-of-the-art large language and voice models through various national and international projects. It is an exciting time to be working in generative AI!
We are looking for bright and motivated individuals to help us achieve ambitious goals. Our latest opening for the Innovation group that develops powerful and socially useful applications for AI technology might be for you. Check it out here:
https://www.bsc.es/join-us/job-opportunities/3025lsltre2

reacted to Delta-Vector's post with 👍 about 12 hours ago

Post

390

For anyone that enjoys Magnum models, I just dropped a 12B that is the first (or second?) stepping stone into Magnum V5

Delta-Vector/rei-12b-6795505005c4a94ebdfdeb39

reacted to AlexBodner's post with 👀 about 16 hours ago

Post

314

How does Deepseek R1 work?
I was wondering the same, so I wrote a thread explaining from scratch everything that you need to know about it!
Breaking down the paper in: https://x.com/AlexBodner_/status/1883602267317927965

reacted to umarigan's post with 👀 about 16 hours ago

Post

369

**Extracting Reasoning Prompts with DeepSeek-R1: A Step Towards Better AI Reasoning**

Hi everyone! 👋

I’m excited to share a small but impactful project I’ve been working on, where I extracted **reasoning prompts** using the **DeepSeek-R1 model**. Reasoning prompts are a powerful way to understand how AI models arrive at their answers, and they can be used to train smaller, more efficient models to generate reasoning. Let me walk you through the process and explain why this is important.

---

#### **The Code: Extracting Reasoning Prompts**

Here’s the code I used to extract reasoning prompts from the openaccess-ai-collective/oo-gpt4-filtered dataset:

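import os
import time

from datasets import load_dataset
from openai import OpenAI
from tqdm import tqdm

# Assumption: the post does not show how `client` and `ds` were created.
# DeepSeek's API is OpenAI-compatible; the env var name and split are illustrative.
client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"], base_url="https://api.deepseek.com")
ds = load_dataset("openaccess-ai-collective/oo-gpt4-filtered", split="train")

reasoning_data = []

for example in tqdm(ds, desc="answering"):
    try:
        response = client.chat.completions.create(
            model='deepseek-reasoner',  # Using DeepSeek-R1 for reasoning
            messages=[
                {"role": "system", "content": example['system_prompt']},
                {"role": "user", "content": example['question']},
            ],
            stream=False,
            max_tokens=4096,
            temperature=0.7,
        )

        # The final answer and the reasoning trace come back as separate fields
        answer = response.choices[0].message.content
        reasoning = response.choices[0].message.reasoning_content

        reasoning_example = {
            "id": example['id'],
            "question": example['question'],
            "answer": answer,
            "reasoning": reasoning,
        }

        reasoning_data.append(reasoning_example)
    except Exception as e:
        # Skip the failed example and pause briefly before moving to the next one
        print(f"Error answering example: {e}")
        time.sleep(3)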

data: umarigan/deepseek-r1-reasoning-prompts
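Once collected, records like these are usually written to disk before being uploaded or used for training. The snippet below is a minimal sketch of one way to do that with JSON Lines; the helper names `save_jsonl` and `load_jsonl` are hypothetical and the post does not say how the dataset was actually saved.

```python
import json
from pathlib import Path


def save_jsonl(records: list[dict], path: str) -> int:
    """Write one JSON object per line; returns the number of records written."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            # ensure_ascii=False keeps non-English text readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)


def load_jsonl(path: str) -> list[dict]:
    """Read the records back, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Calling `save_jsonl(reasoning_data, "reasoning.jsonl")` after the loop would persist every successfully answered example, and the one-object-per-line layout makes it easy to append or resume after an interrupted run.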

reacted to mkurman's post with 👀 about 16 hours ago

Post

980

I’ve simplified things for the AI OS community!

Check out Qwen-2.5-14B-DeepSeek-R1-1M! This one's a cool blend of the latest Qwen 2.5 with 14 billion parameters and has a massive 1 million token context window. It also comes with the DeepSeek R1 version of the Qwen 2.5 14B base model.

Enjoy! 🚀

mkurman/Qwen2.5-14B-DeepSeek-R1-1M

reacted to nicolay-r's post with 👀 about 16 hours ago

Post

860

📢 For those who wish to apply DeepSeek-R1 to tabular / streaming data using a schema of prompts (CoT), OpenRouter AI hosts an API for accessing it:
https://openrouter.ai/deepseek/deepseek-r1

The no-strings option for a quick start with DeepSeek-R1 involves three steps:
✅ OpenRouter provider: https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/open_router.py
✅ Bulk-chain for inferring data: https://github.com/nicolay-r/bulk-chain
✅ JSON Schema for Chain-of-Thought reasoning (see screenshot 📷 below)

📺 Below is a screenshot showing how to quick-start the demo, in which you can test your schema for LLM responses. It first asks you to type in all the parameters needed to complete the request (just the text in this example).

📃 To apply it to JSONL/CSV data, you can use the --src shell parameter to pass the related file

⏳ As for time, I find OpenRouter relatively slow, at 30~40 seconds per request
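To make the "schema of prompts" idea concrete, here is a small sketch of chaining prompts over one record. It is an illustration only: the `SCHEMA` layout, `run_chain`, and `echo_llm` are hypothetical and are not bulk-chain's actual schema format or API; a real setup would replace `echo_llm` with a call to the hosted model.

```python
from typing import Callable

# Hypothetical schema: each step has a prompt template and the field name
# under which its output is stored. This is not bulk-chain's actual format.
SCHEMA = [
    {"prompt": "List the key facts in: {text}", "out": "facts"},
    {"prompt": "Given the facts: {facts}\nSummarize: {text}", "out": "summary"},
]


def run_chain(record: dict, schema: list[dict], llm: Callable[[str], str]) -> dict:
    """Run each step in order; later prompts can reference earlier outputs."""
    result = dict(record)  # copy so the caller's record is not mutated
    for step in schema:
        prompt = step["prompt"].format(**result)
        result[step["out"]] = llm(prompt)
    return result


def echo_llm(prompt: str) -> str:
    """Stand-in for a real API call; simply upper-cases the prompt."""
    return prompt.upper()


row = run_chain({"text": "abc"}, SCHEMA, echo_llm)
print(row["facts"])
```

Each row of a JSONL or CSV file would be passed through `run_chain` as a dict, so the fields named in the prompt templates must match the column names of the source file.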

Models:
deepseek-ai/DeepSeek-R1

reacted to fantos's post with 🚀🔥 about 16 hours ago

Post

1646

🚀 HuggingFace Spaces Ranking Tracker - Your Complete AI Trend Analytics!

Introducing the Spaces Ranking Tracker, a comprehensive analytics dashboard that tracks and analyzes every AI application in the HuggingFace ecosystem.

✨ Key Features:
• Real-time tracking of daily ranking changes over 30 days
• Detailed analysis of top 100 trending spaces
• User-based integrated score visualization
• One-click access to space details
• Interactive rank change graphs

📊 Dashboard Components:
1. Main Dashboard
- Daily rank trend graphs
- Top 20 creators' combined score chart
- Detailed space information cards
- Real-time trending score updates

2. Space Detailed Analysis
- Creation date, current rank, and trending score
- 30-day ranking history
- Direct space access
- Custom color coding for intuitive rank display

3. Interactive Features
- Custom filtering options
- Sorting by various metrics
- Detailed performance statistics
- Comprehensive trending scores
- Historical data tracking

🎯 How to Use:
• Monitor latest AI community trends
• Track your project's performance
• Discover popular AI demos
• Analyze competing projects
• Follow AI ecosystem dynamics

Stay on top of every movement in the HuggingFace ecosystem with daily ranking updates! 👉 Try it now!

🔗 Access Dashboard: fantos/Ranking-Tracker
#HuggingFace #AI #DataVisualization #TrendAnalysis #AITrends
