Li Tan PRO (tanliboy)
AI & ML interests: None yet
Recent Activity
- new activity, about 2 months ago: rombodawg/Rombos-LLM-V2.5-Qwen-72b: what is your "continuous finetuning"
- new activity, about 2 months ago: google/gemma-2-9b-it: Batch Inference causes degraded performance
tanliboy's activity
- what is your "continuous finetuning" (7) · #2 opened 3 months ago by MaziyarPanahi
- Batch Inference causes degraded performance (3) · #43 opened 4 months ago by tanliboy
- Scorecard on popular benchmarks (2) · #2 opened 3 months ago by tanliboy
- Phi-2-Instruct-APO: aligned with Anchored Preference Optimization (16) · #3 opened 3 months ago by rasyosef
- Preference Alignment (4) · #6 opened 3 months ago by tanliboy
- Text Classification with LLMs (7) · #30 opened 4 months ago by dss107
- IFEVAL drop · #16 opened 3 months ago by tanliboy
- bfloat16 vs. float32 · #34 opened 3 months ago by tanliboy
- Qwen 2.5 1.5B retrain? (4) · #12 opened 3 months ago by tomaarsen
- GSM8K Evaluation Result: 84.5 vs. 76.95 (17) · #81 opened 5 months ago by tanliboy
- Finetuning script using HuggingFace (No llama-factory) (19) · #32 opened 4 months ago by 2U1
- Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation. (8) · #120 opened 4 months ago by erildo
- Have you deleted your GitHub page? (7) · #10 opened 4 months ago by xwzy6
- Sliding window vs. Global Attention (5) · #41 opened 4 months ago by tanliboy
- Gemma2-2b training uses much more momory! (2) · #23 opened 4 months ago by bubbleseller
- GemmaSdpaAttention vs GemmaAttention (2) · #71 opened 4 months ago by canqin001
- Fix Llama 3.1 Chat Template to Properly Handle add_generation_prompt (9) · #26 opened 4 months ago by Tostino
- 🍭 Fine-tuning support for Qwen2-VL-7B-Instruct (5) · #1 opened 4 months ago by study-hjt
- Evaluation Result · #15 opened 4 months ago by tanliboy
- How is this dataset supposed to be used to evaluate the model? (4) · #1 opened 4 months ago by realdanielbyrne