Barun (bapatra)
AI & ML interests: Natural Language Processing, Large Model Scaling, Alignment Research, Multimodality
bapatra's activity
Move flash_attn assert from __init__ into calling func · 4 comments · #32 opened 4 months ago by rogerxfeng8
auto_map in config.json doesn't contain Phi3SmallForSequenceClassification · 1 comment · #13 opened 7 months ago by kyeongpil
Add the classifiers to the auto_map · 1 comment · #76 opened 6 months ago by mrm196
Enabled the AutoModelForSequenceClassification in the auto_map · #22 opened 7 months ago by mrm196
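The three auto_map threads above all concern the same mechanism: for remote-code checkpoints, transformers resolves auto classes through the auto_map block in config.json. A minimal loading sketch, assuming the classifier entry has been added; the class path is inferred from the thread titles, and the repo id is an assumption:

```python
from transformers import AutoModelForSequenceClassification

# This resolves only if config.json's auto_map maps
# "AutoModelForSequenceClassification" to the remote class, e.g.
# "modeling_phi3_small.Phi3SmallForSequenceClassification"
# (class path inferred from the thread titles above).
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/Phi-3-small-8k-instruct",  # assumed repo id
    trust_remote_code=True,
    num_labels=2,
)
```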
Ensure the query_states and key_states remain in bf16 · 1 comment · #21 opened 7 months ago by mrm196
Keep getting AssertionError: Flash Attention is not available when loading the model · 1 comment · #7 opened 7 months ago by Complete-your-profile
Phi-3-small crashing error · 3 comments · #12 opened 7 months ago by aravindpai
Crash during fine-tuning · 4 comments · #14 opened 7 months ago by tanliboy
How should data be packed? · 2 comments · #16 opened 7 months ago by shiyue
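The packing question is a general fine-tuning one. A minimal sketch of one common approach (an assumption, not necessarily the thread's answer): concatenate tokenized examples with an EOS separator, then slice the stream into fixed-size blocks so no padding is needed.

```python
def pack(token_lists, eos_id, block_size=2048):
    """Pack tokenized examples into fixed-length blocks.

    Joins examples with the EOS token so each block is fully used
    and no pad tokens are required during training.
    """
    stream = []
    for tokens in token_lists:
        stream.extend(tokens)
        stream.append(eos_id)
    # Drop the tail that does not fill a complete block.
    usable = (len(stream) // block_size) * block_size
    return [stream[i:i + block_size] for i in range(0, usable, block_size)]
```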
What pad token should I use for fine-tuning? · 1 comment · #10 opened 7 months ago by faizsameerahmed96
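A common workaround for a missing pad token (general transformers practice, not necessarily the thread's accepted answer) is to reuse the EOS token for padding; the repo id below is an assumption:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "microsoft/Phi-3-small-8k-instruct",  # assumed repo id
    trust_remote_code=True,
)
# If the tokenizer ships without a pad token, batched fine-tuning
# fails; reusing EOS for padding is the usual fallback.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```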
Shared memory error · 9 comments · #15 opened 7 months ago by marktenenholtz
Update tokenization_phi3_small.py · 1 comment · #18 opened 7 months ago by damajercakms
Update tokenization_phi3_small.py · 1 comment · #14 opened 7 months ago by damajercakms
RuntimeError during fine-tuning: FlashAttention only support fp16 and bf16 data type · 8 comments · #11 opened 7 months ago by faizsameerahmed96
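The FlashAttention dtype errors above share one cause: the flash-attn kernels only accept fp16/bf16 tensors, while from_pretrained defaults to fp32. A minimal sketch of the usual fix, with the repo id an assumption:

```python
import torch
from transformers import AutoModelForCausalLM

# Load the weights in bf16 so the FlashAttention kernels never see
# fp32 query/key/value tensors (they only support fp16 and bf16).
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-small-8k-instruct",  # assumed repo id
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
```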
Where can we download Phi-3 small? · 1 comment · #11 opened 7 months ago by sebastienbo
Why a different architecture from mini and medium? · 5 comments · #5 opened 7 months ago by winddude
target_modules for this phi-3-small model · 10 comments · #3 opened 7 months ago by hackint0sh
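The target_modules thread presumably concerns PEFT/LoRA configuration. A hedged sketch follows; the module names are placeholders, since phi-3-small's architecture differs from mini and medium (see the thread above), so inspect the loaded model with print(model) for the real linear-layer names.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-small-8k-instruct",  # assumed repo id
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    # Placeholder names; verify against the actual module tree.
    target_modules=["query_key_value", "dense"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```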
Flash Attention error during inference · 5 comments · #7 opened 7 months ago by hackint0sh
Is it possible that this is a small version of GPT-3.5? · 1 comment · #6 opened 7 months ago by Trangle