emin temiz
etemiz's activity
Qwen team released QVQ, a large vision LM with reasoning.
It outperforms proprietary VLMs on several benchmarks, and comes with open weights and a demo!
Check them out ⬇️
Demo Qwen/QVQ-72B-preview
Model Qwen/QVQ-72B-Preview
Read more https://qwenlm.github.io/blog/qvq-72b-preview/
Congratulations @JustinLin610 and team!
Want to know about my experiments?
Who would be interested in joining?
The more I read about it, the more groundbreaking it looks.
This, combined with the "Training Large Language Models to Reason in a Continuous Latent Space" paper, is pretty important imo.
The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:
>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
Three-Component Architecture (a rough sketch follows this list):
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes
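To make the layout concrete, here is a minimal PyTorch sketch of the three components wired together. It is a toy illustration under simplifying assumptions: real BLT uses cross-attention between the byte and patch streams plus hash n-gram embeddings, while this version just mean-pools byte states into patches, and all sizes and layer counts are arbitrary.

```python
# Toy sketch of the BLT layout: local encoder -> global latent transformer -> local decoder.
# Hedged: patches are mean-pooled here; the actual model uses cross-attention and n-gram hashes.
import torch
import torch.nn as nn

class ByteLatentSketch(nn.Module):
    def __init__(self, d_local=256, d_global=1024, vocab=256):
        super().__init__()
        self.byte_emb = nn.Embedding(vocab, d_local)
        # Lightweight local encoder over raw bytes
        self.local_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_local, nhead=4, batch_first=True), num_layers=2)
        self.to_global = nn.Linear(d_local, d_global)
        # Powerful global latent transformer over patch representations
        self.global_transformer = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_global, nhead=8, batch_first=True), num_layers=12)
        self.to_local = nn.Linear(d_global, d_local)
        # Local decoder that maps patch states back to per-byte logits
        self.local_decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_local, nhead=4, batch_first=True), num_layers=2)
        self.byte_head = nn.Linear(d_local, vocab)

    def forward(self, byte_ids, patch_bounds):
        # byte_ids: (1, seq_len) ints in [0, 255]; patch_bounds: list of (start, end) spans
        h = self.local_encoder(self.byte_emb(byte_ids))                 # per-byte states
        patches = torch.stack([h[0, s:e].mean(0) for s, e in patch_bounds])
        g = self.global_transformer(self.to_global(patches)[None])      # per-patch states
        # Broadcast each patch state back onto its bytes, then decode bytes
        per_byte = torch.cat([g[0, i].expand(e - s, -1)
                              for i, (s, e) in enumerate(patch_bounds)])[None]
        out = self.local_decoder(h + self.to_local(per_byte))
        return self.byte_head(out)                                      # (1, seq_len, 256)

model = ByteLatentSketch()
ids = torch.randint(0, 256, (1, 12))
print(model(ids, patch_bounds=[(0, 5), (5, 12)]).shape)  # torch.Size([1, 12, 256])
```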
>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models
>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.
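As a hedged illustration of that boundary rule, the sketch below uses a tiny bigram byte model in place of the paper's learned entropy model and starts a new patch whenever next-byte entropy crosses a threshold. The bigram stand-in and the 0.5-bit threshold are assumptions for illustration, not BLT's actual settings.

```python
# Hedged sketch of entropy-driven patch boundaries: a bigram byte model stands in
# for BLT's learned entropy model; threshold value is arbitrary.
import math
from collections import Counter, defaultdict

def train_bigram(data: bytes):
    counts = defaultdict(Counter)
    for prev, nxt in zip(data, data[1:]):
        counts[prev][nxt] += 1
    return counts

def next_byte_entropy(counts, prev: int) -> float:
    c = counts.get(prev)
    if not c:
        return 8.0  # unseen context: treat as maximally uncertain
    total = sum(c.values())
    return -sum(n / total * math.log2(n / total) for n in c.values())

def entropy_patches(data: bytes, counts, threshold=0.5):
    """Close the current patch whenever predicting the next byte gets 'hard'."""
    bounds, start = [], 0
    for i in range(1, len(data)):
        if next_byte_entropy(counts, data[i - 1]) > threshold:
            bounds.append((start, i))
            start = i
    bounds.append((start, len(data)))
    return bounds

corpus = b"the quick brown fox jumps over the lazy dog " * 50
model = train_bigram(corpus)
print(entropy_patches(b"the quick brown fox", model))
```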
This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
It is not ok to remove people from the equation, however efficient the machines are. We can never be sure that synthetic data matches the original in terms of alignment, and further models trained on further synthetic data can derail the whole thing.
That's the hard part. Careful analysis over a long time, and how many people are benefiting from someone and their friends, can give some clues. If someone's solutions have worked most of the time, for many people, over the years, they may be eligible to get into a curated LLM.
Better curation is possible by emphasizing certain texts.
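One hedged way to picture "emphasizing certain texts" is to oversample curated sources when assembling a training mix. The source names and weights below are made up for illustration.

```python
# Toy sketch: weight curated sources more heavily when sampling training text.
import random

sources = {
    "curated_authors.txt": 3.0,  # emphasized
    "general_web.txt": 1.0,      # baseline
}

def sample_source(rng=random):
    names, weights = zip(*sources.items())
    return rng.choices(names, weights=weights, k=1)[0]

print([sample_source() for _ in range(5)])
```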
TL;DR:
- public storage is free and (barring blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)
docs: https://huggingface.co./docs/hub/storage-limits
We optimize our infrastructure continuously to scale our storage for the coming years of growth in machine learning, to the benefit of the community.
cc: @reach-vb @pierric @victor and the HF team
There are a lot of differences of opinion among open source models. Which models are closer to solutions that will work in different situations? How far are the models from reality?
https://wikifreedia.xyz/based-llm-leaderboard/npub1nlk894teh248w2heuu0x8z6jjg2hyxkwdc8cxgrjtm9lnamlskcsghjm9c
So you always have to watch centralized AI but you never have to watch the local LLM.
Trying to download QwQ 32B GGUF. It disconnected like 30 times..
It may increase HF's efficiency by offloading traffic to users instead of you serving all the files.
Maybe you can also create torrents for popular files?
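In the meantime, a hedged workaround for flaky downloads is to pull files through huggingface_hub, which stores partial progress in the cache so a re-run can generally pick up where it left off. The repo id and filename below are assumptions; check the actual repo's file list first.

```python
# Hedged sketch: download a large GGUF via huggingface_hub instead of the browser.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="Qwen/QwQ-32B-Preview-GGUF",      # assumption: adjust to the actual repo id
    filename="qwq-32b-preview-q4_k_m.gguf",   # assumption: pick a real file from the repo
)
print(path)
```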
https://etemiz.substack.com/p/how-we-curate-and-how-it-aligns-ai
by more people i mean me :)
Introducing: AI Video Composer
huggingface-projects/ai-video-composer
Drag and drop your assets (images/videos/audio) to create any video you want using natural language!
It works by asking the model to output a valid FFmpeg command, which can be quite complex, but most of the time Qwen2.5-Coder-32B gets it right (that thing is a beast). It's an update of an old project made with GPT-4; back then (~1.5 years ago) it was almost impossible to make it work with open models, but not anymore. Let's go open weights!
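A hedged sketch of that core idea, not the Space's actual code: ask a coder model for a single ffmpeg command and shell it out. The prompt, file names, and minimal validation below are assumptions.

```python
# Hedged sketch: prompt a coder model for one ffmpeg command, then run it.
import subprocess
from huggingface_hub import InferenceClient

client = InferenceClient()
prompt = ("You are given the files intro.png and voice.mp3. "
          "Output only one ffmpeg command that turns them into a 10s video out.mp4.")
reply = client.chat_completion(
    messages=[{"role": "user", "content": prompt}],
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    max_tokens=200,
).choices[0].message.content

command = reply.strip().strip("`")   # naive cleanup of stray code fences
print(command)
if command.startswith("ffmpeg"):     # minimal sanity check before executing
    subprocess.run(command, shell=True, check=False)
```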