Xiaoxia Wu
xiaoxiawu123
1 follower · 0 following
xwuShirley
AI & ML interests
None yet
xiaoxiawu123's activity
upvoted a paper 17 days ago
APOLLO: SGD-like Memory, AdamW-level Performance
Paper • 2412.05270 • Published 19 days ago • 38
liked a model 3 months ago
meta-llama/Llama-3.2-90B-Vision-Instruct
Image-Text-to-Text • Updated 21 days ago • 72.8k • 308
authored a paper 3 months ago
GRIN: GRadient-INformed MoE
Paper • 2409.12136 • Published Sep 18 • 15
New activity in meta-llama/Llama-3.1-405B-Instruct 5 months ago
why "num_key_value_heads": 16,
#14 opened 5 months ago by xiaoxiawu123