---
license: gemma
library_name: transformers
pipeline_tag: text-generation
extra_gated_heading: Access Gemma on Hugging Face
extra_gated_prompt: >-
  To access Gemma on Hugging Face, you’re required to review and agree to
  Google’s usage license. To do this, please ensure you’re logged in to Hugging
  Face and click below. Requests are processed immediately.
extra_gated_button_content: Acknowledge license
tags:
  - conversational
quantized_by: Davidqian123
base_model: google/gemma-2-2b-it
---

# Gemma-2-2b-Instruct-GGUF Introduction

Gemma 2 Instruct is the latest addition to Google's Gemma family of lightweight, state-of-the-art open models. Built on the same research and technology behind the Gemini models, this 2-billion-parameter model performs well across a range of text generation tasks while remaining compact enough for edge and low-power computing environments.

## Key Features

  • Based on Gemini technology
  • 2 billion parameters
  • Trained on 2 trillion tokens of web documents, code, and mathematics
  • Suitable for edge devices and low-power compute
  • Versatile for text generation, coding, and mathematical tasks
  • Retains the large vocabulary from Gemma 1.1 for enhanced multilingual and coding capabilities

## Applications

Gemma 2 Instruct is designed for a wide range of applications, including:

  • Content creation
  • Chatbots and conversational AI
  • Text summarization
  • Code generation
  • Mathematical problem-solving

For more details, see Google's blog post: [Smaller, Safer, More Transparent: Advancing Responsible AI with Gemma](https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma/)

## Quantized GGUF Models Benchmark

| Name | Quant method | Bits | Size | Use cases |
|------|--------------|------|------|-----------|
| gemma-2-2b-it-Q2_K.gguf | Q2_K | 2 | 1.23 GB | fastest, but high quality loss; not recommended |
| gemma-2-2b-it-Q3_K_S.gguf | Q3_K_S | 3 | 1.36 GB | very high quality loss; strongly not recommended |
| gemma-2-2b-it-Q3_K_M.gguf | Q3_K_M | 3 | 1.46 GB | moderate quality loss; not recommended |
| gemma-2-2b-it-Q3_K_L.gguf | Q3_K_L | 3 | 1.55 GB | noticeable quality loss; not recommended |
| gemma-2-2b-it-Q4_0.gguf | Q4_0 | 4 | 1.63 GB | moderate speed; recommended |
| gemma-2-2b-it-Q4_1.gguf | Q4_1 | 4 | 1.76 GB | moderate speed; recommended |
| gemma-2-2b-it-Q4_K_S.gguf | Q4_K_S | 4 | 1.64 GB | fast and accurate; highly recommended |
| gemma-2-2b-it-Q4_K_M.gguf | Q4_K_M | 4 | 1.71 GB | fast; recommended |
| gemma-2-2b-it-Q5_0.gguf | Q5_0 | 5 | 1.88 GB | fast; recommended |
| gemma-2-2b-it-Q5_1.gguf | Q5_1 | 5 | 2.01 GB | large; prefer a Q4 variant |
| gemma-2-2b-it-Q5_K_S.gguf | Q5_K_S | 5 | 1.88 GB | large; recommended |
| gemma-2-2b-it-Q5_K_M.gguf | Q5_K_M | 5 | 1.92 GB | large; recommended |
| gemma-2-2b-it-Q6_K.gguf | Q6_K | 6 | 2.15 GB | very large; not recommended |
| gemma-2-2b-it-Q8_0.gguf | Q8_0 | 8 | 2.78 GB | very large; not recommended |
| gemma-2-2b-it-F16.gguf | F16 | 16 | 5.24 GB | full precision; extremely large |

All models were quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp).
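As a minimal sketch of how these files can be used (the repository id is a placeholder, and this assumes you have the Hugging Face CLI installed and a local llama.cpp build):

```shell
# Download one quantized file from this repository (Q4_K_S shown,
# the "fast and accurate" option from the table above).
# Replace <repo-id> with this model card's repository id.
huggingface-cli download <repo-id> gemma-2-2b-it-Q4_K_S.gguf --local-dir .

# Start an interactive chat with llama.cpp's CLI.
# -cnv enables conversation mode; -p sets the system prompt.
./llama-cli -m gemma-2-2b-it-Q4_K_S.gguf -cnv -p "You are a helpful assistant."
```

Smaller quants (Q2/Q3) trade quality for size and speed, so a Q4 or Q5 variant is usually the better starting point on devices that can fit them.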

## Invitation to join our beta test to accelerate your on-device AI development

Sign up using this link: [https://forms.gle/vuoktPjPmotnT4sM7](https://forms.gle/vuoktPjPmotnT4sM7)

We're excited to invite you to join our beta test for a new platform designed to enhance on-device AI development.

By participating, you'll have the opportunity to connect with fellow developers and researchers and contribute to the open-source future of on-device AI.

Here's how you can get involved:

  1. Sign an NDA.
  2. Receive a link to our beta testing community on Discord.
  3. Join a brief 15-minute online chat to share your valuable feedback.

Your insights are invaluable to us as we build this platform together.