AI & ML interests

Dataset Viber is your chill repo for data collection, annotation and vibe checks.

dataset-viber's activity

davidberenstein1957
posted an update 5 days ago
🥊 Epic Agent Framework Showdown! Available today!

🔵 In the blue corner, the versatile challenger with a proven track record of knowledge retrieval: LlamaIndex!

🛑 In the red corner, the defender, weighing in with lightweight efficiency: Hugging Face smolagents!

🔗 URL: https://huggingface.co./agents-course

We just published the LlamaIndex unit for the agents course. It offers a great contrast with the smolagents unit by looking at:

- What makes llama-index stand out
- How the LlamaHub is used for integrations
- Creating QueryEngine components
- Using agents and tools
- Agentic and multi-agent workflows

The team has been working flat out on this for a few weeks, with support from Logan Markewich and Laurie Voss over at LlamaIndex.

Who won? You decide!
davidberenstein1957
posted an update 5 days ago
🫸 New release to push vector search to the Hub with vicinity and work with any serialisable objects.

🧑‍🏫 KNN, HNSW, USEARCH, ANNOY, PYNNDESCENT, FAISS, and VOYAGER.

🔗 Example Repo: minishlab/my-vicinity-repo
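
To make the idea concrete, here is a minimal sketch of exact (brute-force) nearest-neighbour search over arbitrary JSON-serialisable items, with a save and restore round trip. It is not vicinity's API: the `TinyIndex` class and its methods are hypothetical names invented for illustration, and it only covers the plain KNN case, not the approximate backends listed above.

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyIndex:
    """Hypothetical toy index: one vector per item, items are any JSON-serialisable objects."""

    def __init__(self, vectors, items):
        if len(vectors) != len(items):
            raise ValueError("vectors and items must have the same length")
        self.vectors = [[float(x) for x in v] for v in vectors]
        self.items = list(items)

    def query(self, vector, k=1):
        """Return the k most similar (item, score) pairs by exhaustive comparison."""
        scored = [(cosine_similarity(vector, v), i) for i, v in enumerate(self.vectors)]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(self.items[i], score) for score, i in scored[:k]]

    def to_json(self):
        """Serialise vectors and items together so the index can be stored or shared."""
        return json.dumps({"vectors": self.vectors, "items": self.items})

    @classmethod
    def from_json(cls, payload):
        """Rebuild an index from the string produced by to_json."""
        data = json.loads(payload)
        return cls(data["vectors"], data["items"])


index = TinyIndex(
    vectors=[[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]],
    items=[{"id": "a"}, {"id": "b"}, {"id": "c"}],
)

# Round-trip through JSON, then query the restored copy.
restored = TinyIndex.from_json(index.to_json())
results = restored.query([0.9, 0.1], k=2)
```

Storing the items next to their vectors is what lets a restored index return the original objects rather than bare row numbers.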
davidberenstein1957
posted an update 26 days ago
🚀 Find banger tools for your smolagents!

I created the Tools gallery, which makes tools specifically developed by/for smolagents searchable and visible. This will help with:
- inspiration
- best practices
- finding cool tools

Space: davidberenstein1957/smolagents-and-tools
davidberenstein1957
posted an update 3 months ago
Introducing the Synthetic Data Generator, a user-friendly application that takes a no-code approach to creating custom datasets with Large Language Models (LLMs). The best part: a simple step-by-step process makes dataset creation a breeze, so anyone can create datasets and models in minutes without writing any code.

Blog: https://huggingface.co./blog/synthetic-data-generator
Space: argilla/synthetic-data-generator
davidberenstein1957
posted an update 3 months ago
Open Preference Dataset for Text-to-Image Generation by the 🤗 Community

Open Image Preferences is an Apache 2.0 licensed dataset for text-to-image generation. This dataset contains 10K text-to-image preference pairs across common image generation categories, while using different model families and varying prompt complexities.

https://huggingface.co./blog/image-preferences
davidberenstein1957
posted an update 3 months ago
This is amazing for cheap model fine-tunes without the hassle of actual deployment! TIL: LoRA fine-tunes of models on the Hub can be used directly for inference!


davidberenstein1957
posted an update 3 months ago
The Data Is Better Together community is set to release the first Apache 2.0 licensed image preference dataset!

Great work and let's give this a final push :)

@aashish1904 congrats on your month of HF pro. There is more to win during this sprint!

@aashish1904 @AnyaDesdein @davidberenstein1957 @Malalatiana @beta3 @fffiloni @munish0838 @Reza2kn @bbunzeck @Creazycreator @andrei-saceleanu @jafhaponiuk @rca-etl @kf120 @burtenshaw @mmhamdy @grib0ed0v @Doopus @AnyaDes @ttkap @Xceron @Lewox @davanstrien @Azazelle @adirik @Ashish08 @AntonVic @kenantang @sdiazlor @g-ronimo @dennis-rall @prithivMLmods @girtss3 @flozi00 @WaveCut @Taylor658 @Wildminder @Sara9999 @phaelishall @sararob @dvilasuero @pgabrys @plaguss @CDS899 @timajwilliams @rudzinskimaciej @pavel-ai @aggr8 @ignacioct @MouseAI @Leeps @MaksKul @NicolasDmln @Muinez @kusht55 @caiolang @Jakub-Brand24 @loamy @Demijan @eliab96 @Viewegger @JosephCatrambone @p1atdev @mrshu @o639 @Targezed @Aviv-anthonnyolime @thliang01 @Ahmed-Amine @glards @pranaykoppula @nataliaElv @MaPirlet @alvarobartt @gabrielmbmb @zlicastro @Jaydip @Chouettecheveche @lilcheaty @ruyrdiaz @robintema @fdaudens @ggcristian @a-r-r-o-w @pates @joheras @stopsatgreen @bezo97 @chachi902 @iamyann @liamcripwell @dmb23 @korbih @anonymous7743 @akbdx18 @OVAWARE @severo @akontra @lichorosario @lhoestq @SebastianBodza @Vishnou @ameerazam08 @appoose @Mukei @mearco @joaquincabezas @Fizzarolli @thomastraum @igortopolski @OxxoCodes @patrickfleith @asoria @bn22 @sitammeur @Krodolf @bergr7f @Sbxxn @wietsevenema @sugatoray @Iamladi @MikeTrizna @feveromo @mokady @Bolero @prath @Dowwie @kfahn @decodingchris @alili2050 @RahulRaman @yzimmermann @Ameeeee @ecyht2 @MattMC001 @hemanthkumarak @Thegorgibus @akos2 @LawRun @ramithuh @SuperMuel @sjans @peterizsak @mosama @Eyel @mtr3 @cfahlgren1 @legentil @clem @Citaman @Aurelien-Morgan @AntoineBourgois @TotoB12 @Stanmey @osanseviero @multimodalart @maxiw @ariG23498 @ngk89 @femboysLover @dvs @tacohiddink @blanchon @DavidJimenez