cloudyu committed on
Commit
59ef953
·
1 Parent(s): 42db6b2

Create README.md

  1. README.md +70 -0
README.md ADDED
@@ -0,0 +1,70 @@
+ ---
+ license: cc-by-nc-4.0
+ ---
+
+ # Mixtral MOE 2x7B
+
+ A Mixture-of-Experts (MoE) model built with mergekit from the following models:
+
+ * [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
+ * [NurtureAI/neural-chat-7b-v3-16k](https://huggingface.co/NurtureAI/neural-chat-7b-v3-16k)
+ * [jondurbin/bagel-dpo-7b-v0.1](https://huggingface.co/jondurbin/bagel-dpo-7b-v0.1)
+
+ The model loads and generates coherent text.
+
+ GPU code example
+
+ ```
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ ## v2 models
+ model_path = "cloudyu/Mixtral_7Bx2_MoE"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
+ # Load the weights 4-bit quantized (requires bitsandbytes) and let
+ # device_map='auto' place them on the available GPU(s).
+ model = AutoModelForCausalLM.from_pretrained(
+     model_path, torch_dtype=torch.float32, device_map='auto', local_files_only=False, load_in_4bit=True
+ )
+ print(model)
+ # Keep reading prompts until an empty line is entered.
+ prompt = input("please input prompt:")
+ while len(prompt) > 0:
+     input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
+
+     generation_output = model.generate(
+         input_ids=input_ids, max_new_tokens=500, repetition_penalty=1.2
+     )
+     print(tokenizer.decode(generation_output[0]))
+     prompt = input("please input prompt:")
+ ```
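For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the loop above prints the prompt again before the completion. Below is a minimal sketch of trimming the prompt, using plain Python lists as stand-ins for the token tensors. The helper name `strip_prompt` and the token ids are hypothetical and are not part of transformers or of this repository.

```python
def strip_prompt(prompt_ids, output_ids):
    """Return only the tokens that follow the prompt in the generated sequence."""
    if output_ids[:len(prompt_ids)] != prompt_ids:
        raise ValueError("output does not start with the prompt tokens")
    return output_ids[len(prompt_ids):]


# Stand-in token ids for illustration only (not real tokenizer output).
prompt_ids = [1, 733, 16289]
output_ids = prompt_ids + [5201, 28725, 2]

new_ids = strip_prompt(prompt_ids, output_ids)
print(new_ids)
```

With real tensors the equivalent slice would be `generation_output[0][input_ids.shape[1]:]`, which can then be passed to `tokenizer.decode`.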
+
+ CPU example
+
+ ```
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ ## v2 models
+ model_path = "cloudyu/Mixtral_7Bx2_MoE"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
+ # Load the full-precision (float32) weights into CPU memory.
+ model = AutoModelForCausalLM.from_pretrained(
+     model_path, torch_dtype=torch.float32, device_map='cpu', local_files_only=False
+ )
+ print(model)
+ # Keep reading prompts until an empty line is entered.
+ prompt = input("please input prompt:")
+ while len(prompt) > 0:
+     input_ids = tokenizer(prompt, return_tensors="pt").input_ids
+
+     generation_output = model.generate(
+         input_ids=input_ids, max_new_tokens=500, repetition_penalty=1.2
+     )
+     print(tokenizer.decode(generation_output[0]))
+     prompt = input("please input prompt:")
+ ```
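The GPU example loads the weights with `load_in_4bit=True`, while the CPU example keeps them in float32. The difference in weight memory can be roughly estimated with the arithmetic below. This is a sketch only: the parameter count is an assumed placeholder that this README does not state, and the estimate ignores quantization overhead, activations, and the KV cache.

```python
def estimate_weight_gib(n_params, bits_per_param):
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# ASSUMPTION: placeholder parameter count, not taken from this README.
N_PARAMS = 12_900_000_000

for label, bits in [("float32", 32), ("4-bit", 4)]:
    print(f"{label}: ~{estimate_weight_gib(N_PARAMS, bits):.1f} GiB")
```

Whatever the true parameter count is, the 4-bit weights take about one eighth of the float32 footprint, which is why the quantized path is the one more likely to fit on a single GPU.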