Very excited to share the first two official Gemma variants from Google! Today at Google Cloud Next, we announced cutting-edge models for code and research!
First, CodeGemma, a new variant specialized for code. Second, RecurrentGemma (google/recurrentgemma-release-66152cbdd2d6619cb1665b7a), which is based on the outstanding Google DeepMind research behind Griffin: https://arxiv.org/abs/2402.19427. RecurrentGemma is a research variant that enables higher throughput and a much smaller memory footprint. We are excited about new architectures, especially in the lightweight Gemma sizes, where innovations like RecurrentGemma can scale modern AI to many more use cases.
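If you want to kick the tires, here is a minimal sketch of running RecurrentGemma with transformers. The google/recurrentgemma-2b checkpoint name and the minimum transformers version are assumptions on my part, not part of this announcement:

```python
# Minimal sketch: generate text with RecurrentGemma via transformers.
# Assumes the google/recurrentgemma-2b checkpoint and a transformers
# release that includes the RecurrentGemma architecture (>= 4.40).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/recurrentgemma-2b"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("The Griffin architecture mixes", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```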
For details on the launches of these models, check out our launch blog -- and please do not hesitate to send us feedback. We are excited to see what you build with CodeGemma and RecurrentGemma!
Huge thanks to the Hugging Face team for helping ensure that these models work flawlessly in the Hugging Face ecosystem at launch!
New base pretrained models on the Open LLM Leaderboard!
Two new OSS models from Google, which is getting back in the game! The 7B ranks 2nd on the leaderboard and beats Mistral, notably on GSM8K (math).
I am thrilled to announce Gemma, new 2B and 7B models from Google, based on the same research and technology used to train the Gemini models! These models achieve state-of-the-art performance for their size and are available starting today across Transformers, Google Cloud, and many other surfaces worldwide.
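As a quick, hedged sketch of what the Transformers launch looks like in practice (the google/gemma-7b checkpoint name and version requirement are my assumptions; gated checkpoints also require accepting the license and logging in with huggingface-cli):

```python
# Minimal sketch: run Gemma 7B through the transformers pipeline.
# Assumes the google/gemma-7b checkpoint and a transformers release
# with Gemma support (>= 4.38).
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-7b",  # assumed checkpoint name
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the accelerate package
)
print(generator("Gemma is a family of", max_new_tokens=32)[0]["generated_text"])
```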
These launches are the product of an outstanding collaboration between the Google DeepMind and Hugging Face teams over the last few months -- very proud of the work both teams have done, from Vertex AI integration to optimization across the stack. Read more about the partnership from @philschmid, @osanseviero, and @pcuenq on the launch blog: https://huggingface.co./blog/gemma
More information below if you are curious about training details, eval results, and safety characteristics!