mohitsha/Llama-2-7b-chat-hf-FP8-FNUZ-KV-AMMO
Tags: Text Generation, Transformers, Safetensors, llama, conversational, text-generation-inference, Inference Endpoints
No model card
Downloads last month: 2
Safetensors
Model size: 6.74B params
Tensor type: FP16
Inference API (serverless) is not available; the repository is disabled.
Collection including mohitsha/Llama-2-7b-chat-hf-FP8-FNUZ-KV-AMMO:
FP8 KV Cache (Collection): Models with FP8 KV Cache Scales • 6 items • Updated Jul 4