Where is the study?
Mads PRO
mhenrichsen
AI & ML interests: None yet
Recent Activity
replied to singhsidhukuldeep's post, 14 days ago
Fascinating new research alert! Just read a groundbreaking paper on understanding Retrieval-Augmented Generation (RAG) systems and their performance factors.
Key insights from this comprehensive study:
>> Architecture Deep Dive
The researchers analyzed RAG systems across 6 datasets (3 code-related, 3 QA-focused) using multiple LLMs. Their investigation revealed critical insights into four key design factors:
Document Types Impact:
• Oracle documents (ground truth) aren't always optimal
• Distracting documents significantly degrade performance
• Surprisingly, irrelevant documents boost code generation by up to 15.6%
Retrieval Precision:
• Performance varies dramatically by task
• QA tasks need 20-100% retrieval recall
• Perfect retrieval still fails up to 12% of the time on previously correct instances
Document Selection:
• More documents ≠ better results
• Adding documents can cause errors on previously correct samples
• Performance degradation increases ~1% per 5 additional documents in code tasks
Prompt Engineering:
• Most advanced prompting techniques underperform simple zero-shot prompts
• Technique effectiveness varies significantly across models and tasks
• Complex prompts excel at difficult problems but struggle with simple ones
>> Technical Implementation
The study utilized:
• Multiple retrievers including BM25, dense retrievers, and specialized models
• Comprehensive corpus of 70,956 unique API documents
• Over 200,000 API calls and 1,000+ GPU hours of computation
• Sophisticated evaluation metrics tracking both correctness and system confidence
💡 Key takeaway: RAG system optimization requires careful balancing of multiple factors - there's no one-size-fits-all solution.
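As a rough illustration of the retrieval stage the study varies (BM25-style sparse retrieval feeding retrieved documents into the prompt), here is a minimal sketch. The three-document corpus, the query, and the prompt template are made up for illustration; k1 and b are the usual BM25 defaults. This is not the paper's actual pipeline.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # document frequency of each query term across the corpus
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1)
            num = tf[t] * (k1 + 1)
            den = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * num / den
        scores.append(score)
    return scores

def build_rag_prompt(query, corpus, top_k=2):
    """Retrieve the top_k documents and prepend them to the prompt."""
    tokenized = [doc.lower().split() for doc in corpus]
    scores = bm25_scores(query.lower().split(), tokenized)
    ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
    context = "\n".join(corpus[i] for i in ranked[:top_k])
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Illustrative corpus and query, not from the study.
corpus = [
    "The requests library sends HTTP requests in Python.",
    "BM25 ranks documents by term frequency and inverse document frequency.",
    "Paris is the capital of France.",
]
prompt = build_rag_prompt("How does BM25 rank documents?", corpus)
```

The study's findings on document count and distracting documents all act on the `top_k` / `context` step above: every retrieved document, relevant or not, ends up in the model's prompt.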
new activity about 1 month ago: syvai/hviske-v2: Punctuation and capitalization ("Tegnsætning og store bogstaver")
mhenrichsen's activity
replied to singhsidhukuldeep's post, 14 days ago
Totally unrelated, but please take this shit down.
https://huggingface.co./nesaorg/benchmark_v0
173M downloads. It's spam for their crypto scam.
@pierric
@victor
@reach-vb
@julien-c
Punctuation and capitalization ("Tegnsætning og store bogstaver")
1
#2 opened about 1 month ago by RasmusKlett
Runtime error here on HuggingFace
2
#1 opened about 1 month ago by borup
Better than gpt-4o on what benchmark/dataset?
1
#1 opened about 2 months ago by mathiasn1
Minor corrections ("Mindre rettelser")
1
#1 opened 2 months ago by KennethEnevoldsen
replied to louisbrulenaudet's post, 4 months ago
Nice! How do you make the graph itself?
Awesome. Thanks @Waiplin
Cool cool. Are there any public-facing APIs we can use to pull data about models? It could be nice to show total downloads or likes.
Does HF provide any code snippets to easily integrate into websites?
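On the question above: the Hub does expose a public REST endpoint, `https://huggingface.co/api/models/{repo_id}`, whose JSON response includes `downloads` and `likes` fields. The sketch below only extracts those fields from a response; the sample payload and its numbers are made up for illustration, and the actual fetch (e.g. with `requests.get`) is left as a comment.

```python
import json

# Fetching https://huggingface.co/api/models/{repo_id} (e.g. with requests.get)
# returns JSON that includes "id", "downloads", and "likes" fields.

def model_stats(payload: dict) -> dict:
    """Pull the display-worthy fields out of a /api/models response."""
    return {
        "id": payload.get("id"),
        "downloads": payload.get("downloads", 0),
        "likes": payload.get("likes", 0),
    }

# Sample payload with illustrative values, not real numbers.
sample = json.loads('{"id": "mhenrichsen/some-model", "downloads": 1234, "likes": 56}')
stats = model_stats(sample)
```

The `huggingface_hub` Python package wraps the same data (`HfApi().model_info(repo_id)`), which may be more convenient than calling the endpoint directly.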
License?
4
#2 opened 4 months ago by mhenrichsen
How to convert the model to a GGUF model?
3
#3 opened 9 months ago by pksorensen
Adding `safetensors` variant of this model
#1 opened 8 months ago by SFconvertbot
Generated text is garbled?
5
#53 opened 8 months ago by gbhall
Split by languages?
4
#7 opened 8 months ago by mhenrichsen