LLM from Scratch

community

https://github.com/ayulockin/llm_scratch

AI & ML interests

Large Language Models

Recent Activity

ayut updated a dataset 9 days ago

llm-scratch/wmt14-de-en-split

ariG23498 updated a dataset 12 days ago

llm-scratch/wmt14-de-en-split

ariG23498 published a dataset 12 days ago

llm-scratch/wmt14-de-en-split

llm-scratch's activity

ayut

updated a dataset 9 days ago

llm-scratch/wmt14-de-en-split

Viewer • Updated 9 days ago • 4.51M • 128

ariG23498

updated a dataset 12 days ago

llm-scratch/wmt14-de-en-split

ariG23498

published a dataset 12 days ago

llm-scratch/wmt14-de-en-split

ariG23498

posted an update about 2 months ago

Post

2198

Tried my hand at simplifying the derivations of Direct Preference Optimization.

I cover how one can reformulate RLHF into DPO. The idea of implicit reward modeling is chef's kiss.

Blog: https://huggingface.co/blog/ariG23498/rlhf-to-dpo
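
For readers who want to see the implicit reward idea in concrete terms, here is a minimal sketch of the standard per-example DPO loss. It is not taken from the blog post; the function names, argument names, and the default `beta` are assumptions made for illustration. Inputs are summed sequence log-probabilities of the chosen and rejected responses under the policy and under the frozen reference model.

```python
import math


def implicit_reward(policy_logp: float, ref_logp: float, beta: float = 0.1) -> float:
    """Implicit reward: beta * log(pi_theta(y|x) / pi_ref(y|x)).

    Names and the default beta are illustrative assumptions.
    """
    return beta * (policy_logp - ref_logp)


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Per-example DPO loss: -log(sigmoid(r_chosen - r_rejected))."""
    margin = implicit_reward(policy_chosen_logp, ref_chosen_logp, beta) - implicit_reward(
        policy_rejected_logp, ref_rejected_logp, beta
    )
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy matches the reference, both implicit rewards are zero.
loss_at_init = dpo_loss(-5.0, -7.0, -5.0, -7.0)
# When the policy raises the chosen response and lowers the rejected one,
# the loss drops below its starting value.
loss_improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)
```

No separate reward model appears anywhere in the sketch: the reward is read off the ratio between the policy and the reference, which is what makes the reformulation from RLHF possible.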

ariG23498

posted an update about 2 months ago

Post

1945

Timm ❤️ Transformers

With the latest version of transformers, you can now use any timm model with the familiar transformers API.

Blog Post: https://huggingface.co/blog/timm-transformers
Repository with examples: https://github.com/ariG23498/timm-wrapper-examples
Collection: ariG23498/timmwrapper-6777b85f1e8d085d3f1374a1

ariG23498

updated a Space 3 months ago

README

🚀

Understanding LLMs from scratch

ariG23498

posted an update 3 months ago

Post

1424

We are blessed with another iteration of PaliGemma. Google launches PaliGemma 2.

google/paligemma-2-release-67500e1e1dbfdd4dee27ba48

merve/paligemma2-vqav2

ariG23498

posted an update 4 months ago

Post

2953

Qwen/qwen25-66e81a666513e518adb90d9e

Qwen/Qwen2.5-Coder-Artifacts

Qwen/Qwen2.5-Coder-demo

ariG23498

posted an update 5 months ago

Post

1598

Cohere drops two new multilingual models!

CohereForAI/aya-expanse-8b
CohereForAI/aya-expanse-32b

Try them out here:

CohereForAI/aya_expanse

ariG23498

posted an update 7 months ago

Post

1626

You can now use DoRA for your embedding layers!

PR: https://github.com/huggingface/peft/pull/2006

I have documented my journey with this PR in a blog post for everyone to read. The highlight was when the first author of DoRA reviewed my code.

Blog Post: https://huggingface.co/blog/ariG23498/peft-dora

Huge thanks to @BenjaminB for all the help I needed.

ariG23498

authored a paper over 1 year ago

G-SimCLR : Self-Supervised Contrastive Learning with Guided Projection via Pseudo Labelling

Paper • 2009.12007 • Published Sep 25, 2020

Team members 2
