unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit

Text Generation · text-generation-inference · Inference Endpoints · 4-bit precision


Finetune Llama 3.1, Gemma 2, and Mistral 2-5x faster with 70% less memory via Unsloth!

We have a free Google Colab Tesla T4 notebook for Llama 3.1 (8B) here: https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRnrIOjZAuxp?usp=sharing

✨ Finetune for Free

All notebooks are beginner-friendly! Add your dataset, click "Run All", and you'll get a 2x faster finetuned model that can be exported to GGUF or vLLM, or uploaded to Hugging Face.
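Outside the notebooks, the same workflow can be sketched in a few lines of Python. This is a minimal sketch, assuming the `unsloth` package is installed and a CUDA GPU is available; the LoRA values in `lora_settings` and the helper names `lora_settings` and `load_for_finetuning` are illustrative assumptions, not settings taken from this card. Note that the free T4 notebook targets the 8B model; this 70B checkpoint will likely need a much larger GPU.

```python
MODEL_NAME = "unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit"


def lora_settings(rank: int = 16) -> dict:
    """Illustrative LoRA hyperparameters (assumed values, not from this card)."""
    return {
        "r": rank,
        "lora_alpha": rank,
        "lora_dropout": 0.0,
        "bias": "none",
        # Attention and MLP projection layers of a Llama-style block.
        "target_modules": [
            "q_proj", "k_proj", "v_proj", "o_proj",
            "gate_proj", "up_proj", "down_proj",
        ],
    }


def load_for_finetuning(max_seq_length: int = 2048):
    """Load the pre-quantized checkpoint and attach LoRA adapters."""
    # Imported lazily so the module can be inspected without a GPU.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=MODEL_NAME,
        max_seq_length=max_seq_length,
        dtype=None,          # let Unsloth pick a dtype for the GPU
        load_in_4bit=True,   # the checkpoint is already stored in 4-bit
    )
    model = FastLanguageModel.get_peft_model(model, **lora_settings())
    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = load_for_finetuning()
    print(type(model).__name__)
```

The returned model and tokenizer can then be passed to a trainer of your choice, as the notebooks do.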

| Unsloth supports | Free Notebooks | Performance | Memory use |
|---|---|---|---|
| Llama-3.2 (3B) | ▶️ Start on Colab | 2.4x faster | 58% less |
| Llama-3.2 (11B vision) | ▶️ Start on Colab | 2x faster | 60% less |
| Llama-3.1 (8B) | ▶️ Start on Colab | 2.4x faster | 58% less |
| Qwen2 VL (7B) | ▶️ Start on Colab | 1.8x faster | 60% less |
| Qwen2.5 (7B) | ▶️ Start on Colab | 2x faster | 60% less |
| Phi-3.5 (mini) | ▶️ Start on Colab | 2x faster | 50% less |
| Gemma 2 (9B) | ▶️ Start on Colab | 2.4x faster | 58% less |
| Mistral (7B) | ▶️ Start on Colab | 2.2x faster | 62% less |
| DPO - Zephyr | ▶️ Start on Colab | 1.9x faster | 19% less |

This conversational notebook is useful for ShareGPT ChatML / Vicuna templates.
This text completion notebook is for raw text. This DPO notebook replicates Zephyr.
* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster.

Downloads last month: 29,662

Safetensors

Model size: 37.4B params

Tensor type: F32 · BF16 · U8
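The 37.4B figure is smaller than the model's nominal 70B because it counts stored tensor elements, and a 4-bit checkpoint packs two weights into each U8 byte. The sketch below reconstructs that number under stated assumptions: the layer dimensions come from the public Llama 3.1 70B configuration (not from this page), the block size of 64 is the assumed bitsandbytes 4-bit default, and the embeddings and output head are assumed to stay unquantized. Whether the page counter tallies elements in exactly this way is also an assumption.

```python
# Assumed Llama 3.1 70B dimensions (public model config, not stated on this page).
HIDDEN = 8192
INTERMEDIATE = 28672
LAYERS = 80
VOCAB = 128256
KV_DIM = 1024   # 8 key/value heads x 128 head dimension
BLOCK = 64      # assumed 4-bit quantization block size


def count_elements() -> dict:
    """Compare logical parameters with stored elements in a packed 4-bit checkpoint."""
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q_proj, o_proj + k_proj, v_proj
    mlp = 3 * HIDDEN * INTERMEDIATE                   # gate_proj, up_proj, down_proj
    quantized = LAYERS * (attn + mlp)                 # weights stored in 4-bit
    embeddings = 2 * VOCAB * HIDDEN                   # embed_tokens + lm_head (assumed unquantized)
    norms = LAYERS * 2 * HIDDEN + HIDDEN              # RMSNorm weights
    packed = quantized // 2                           # two 4-bit weights per U8 element
    scales = quantized // BLOCK                       # one scale entry per block
    return {
        "logical": quantized + embeddings + norms,
        "stored": packed + scales + embeddings + norms,
    }


if __name__ == "__main__":
    counts = count_elements()
    print(f"logical: {counts['logical'] / 1e9:.2f}B")  # 70.55B
    print(f"stored:  {counts['stored'] / 1e9:.2f}B")   # 37.40B
```

Under these assumptions the stored count rounds to 37.4B, which matches the model size shown here, while the logical count stays at about 70.55B.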


Model tree for unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit

Base model: meta-llama/Llama-3.1-70B

Finetuned: meta-llama/Llama-3.1-70B-Instruct

Quantized (95): this model


Spaces using unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit: 1

Collections including unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit

4bit Instruct Models

18 items • Updated 1 day ago • 26

Llama 3.1 Collection

Meta's Llama 3.1 models including 8B, 70B, 405B. Includes 4-bit bnb and original versions. • 10 items • Updated 1 day ago • 1