---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- Sailor2-20B-Chat
---
Quantizations of https://huggingface.co/sail/Sailor2-20B-Chat

### Inference Clients/UIs
* [llama.cpp](https://github.com/ggerganov/llama.cpp)
* [KoboldCPP](https://github.com/LostRuins/koboldcpp)
* [ollama](https://github.com/ollama/ollama)
* [jan](https://github.com/janhq/jan)
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
* [GPT4All](https://github.com/nomic-ai/gpt4all)
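
For a quick scripted test of one of these GGUF files, the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) bindings for llama.cpp can load it directly. A minimal sketch follows; the repo id and quant filename are assumptions, so point them at this repository and whichever `.gguf` variant you downloaded.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Assumed repo id and quant filename -- substitute the actual
# .gguf variant you want from this repository.
model_path = hf_hub_download(
    repo_id="duyntnet/Sailor2-20B-Chat-imatrix-GGUF",
    filename="Sailor2-20B-Chat.Q4_K_M.gguf",
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,       # context window size
    n_gpu_layers=-1,  # offload all layers to GPU when built with GPU support
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an AI assistant named Sailor2, created by Sea AI Lab."},
        {"role": "user", "content": "Beri saya pengenalan singkat tentang model bahasa besar."},  # Indonesian: "Give me a short introduction to large language models."
    ],
    max_tokens=512,
)
print(response["choices"][0]["message"]["content"])
```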
---

# From original readme

Sailor2 is a community-driven initiative that brings cutting-edge multilingual language models to South-East Asia (SEA).
Our research highlights a strong demand for models in the **8B and 20B parameter** range for production use, alongside **1B models** for specialized applications,
such as speculative decoding and research purposes.
These models, released under the **Apache 2.0 license**, provide enhanced accessibility to advanced language technologies across the region.

Sailor2 builds upon the foundation of the awesome multilingual model [Qwen 2.5](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e) and
is continuously pre-trained on **500B tokens** to better support **15 languages** with a unified model.
These languages include English, Chinese, Burmese, Cebuano, Ilocano, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tagalog, Thai, Vietnamese, and Waray.
By addressing the growing demand for diverse, robust, and accessible language models, Sailor2 seeks to serve the underserved areas of SEA with open, inclusive, and accessible multilingual LLMs.
The Sailor2 models come in three sizes, 1B, 8B, and 20B, which are **expanded from the Qwen2.5 base models** of 0.5B, 7B, and 14B, respectively.

## Requirements
The code for Sailor2 is available in the latest Hugging Face transformers, and we advise you to install `transformers==4.46.3`.
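
A quick sanity check that the pinned version is actually the one in your environment (a trivial sketch):

```python
import transformers

# The readme advises transformers==4.46.3; fail fast if something else is installed.
assert transformers.__version__ == "4.46.3", f"found transformers {transformers.__version__}"
```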

## Quickstart

The following code snippet shows how to load the tokenizer and model, and how to generate text.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"

# Load the model in bfloat16 and let `device_map="auto"` spread it
# across the available devices.
model = AutoModelForCausalLM.from_pretrained(
    'sail/Sailor2-20B-Chat',
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained('sail/Sailor2-20B-Chat')
system_prompt = \
'You are an AI assistant named Sailor2, created by Sea AI Lab. \
As an AI assistant, you can answer questions in English, Chinese, and Southeast Asian languages \
such as Burmese, Cebuano, Ilocano, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tagalog, Thai, Vietnamese, and Waray. \
Your responses should be friendly, unbiased, informative, detailed, and faithful.'

prompt = "Beri saya pengenalan singkat tentang model bahasa besar."  # Indonesian: "Give me a short introduction to large language models."
# prompt = "Hãy cho tôi một giới thiệu ngắn gọn về mô hình ngôn ngữ lớn."  # Vietnamese
# prompt = "ให้ฉันแนะนำสั้น ๆ เกี่ยวกับโมเดลภาษาขนาดใหญ่"  # Thai

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": prompt}
]
# Render the conversation into a single prompt string using the
# model's chat template, ending with the assistant turn marker.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(device)
input_ids = model_inputs.input_ids

generated_ids = model.generate(
    input_ids,
    max_new_tokens=512,
)

# Strip the prompt tokens so only the newly generated text is decoded.
generated_ids = [
    output_ids[len(in_ids):] for in_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
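
To stream tokens to stdout as they are produced instead of waiting for the full completion, transformers' `TextStreamer` can be attached to the same `generate` call. A minimal sketch reusing `model`, `tokenizer`, and `model_inputs` from above:

```python
from transformers import TextStreamer

# Decodes and prints tokens as they arrive, skipping the prompt
# and any special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

_ = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512,
    streamer=streamer,
)
```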