HF1BitLLM/Llama3-8B-1.58-100B-tokens

Text Generation

text-generation-inference

Inference Endpoints

8-bit precision

added missing imports

#12 opened 4 months ago by

Triton error while running demo code

#11 opened 5 months ago by

Slower than standard Llama 8b?

#10 opened 5 months ago by

I found some errors when building on a rpi 5

#9 opened 5 months ago by

Can you try converting DeepSeek-V2.5 or Llama-3.1-Nemotron-70B-Instruct-HF?

#8 opened 5 months ago by

Finetuning this model

#7 opened 5 months ago by

GGUF conversion

#3 opened 6 months ago by