---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- INTELLECT-1-Instruct
---
Quantizations of https://huggingface.co/PrimeIntellect/INTELLECT-1-Instruct

### Inference Clients/UIs
* [llama.cpp](https://github.com/ggerganov/llama.cpp)
* [KoboldCPP](https://github.com/LostRuins/koboldcpp)
* [ollama](https://github.com/ollama/ollama)
* [jan](https://github.com/janhq/jan)
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
* [GPT4All](https://github.com/nomic-ai/gpt4all)
---

# From original readme

**INTELLECT-1** is the first collaboratively trained 10-billion-parameter language model, trained from scratch on 1 trillion tokens of English text and code.

This is an instruct model. The base model associated with it is [INTELLECT-1](https://huggingface.co/PrimeIntellect/INTELLECT-1).

**INTELLECT-1** was trained on up to 14 concurrent nodes distributed across 3 continents, with 30 independent community contributors providing compute.
The training code uses the [prime framework](https://github.com/PrimeIntellect-ai/prime), a scalable distributed training framework designed for fault-tolerant, dynamically scaling, high-performance training on unreliable, globally distributed workers.
The key abstraction that allows dynamic scaling is the `ElasticDeviceMesh`, which manages dynamic global process groups for fault-tolerant communication across the internet and local process groups for communication within a node.
The model was trained using the [DiLoCo](https://arxiv.org/abs/2311.08105) algorithm with 100 inner steps. The global all-reduce was done with custom int8 all-reduce kernels to shrink the communication payload, reducing the communication overhead by a factor of 400x.

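The 400x figure can be sanity-checked as the product of the two reductions: DiLoCo's 100 inner steps mean a global all-reduce happens 100x less often, and int8 quantization shrinks each payload 4x relative to a full-precision baseline. A minimal sketch of that arithmetic (the fp32 baseline is our assumption, not stated in the card):

```python
# Back-of-envelope check of the claimed 400x communication reduction.
# Assumption: baseline is an fp32 all-reduce after every optimizer step.

inner_steps = 100               # DiLoCo inner steps between global all-reduces
fp32_bytes, int8_bytes = 4, 1   # bytes per parameter before/after int8 quantization

frequency_reduction = inner_steps             # sync 100x less often
payload_reduction = fp32_bytes // int8_bytes  # each sync is 4x smaller

total_reduction = frequency_reduction * payload_reduction
print(total_reduction)  # 400
```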
For more detailed technical insights, please refer to our [technical paper](https://github.com/PrimeIntellect-ai/prime).

**Note: You must add a BOS token at the beginning of each sample. Performance may be impacted otherwise.**

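A minimal sketch of what the BOS note implies, using an illustrative token-ID list rather than the real tokenizer. The `bos_id` value below is a placeholder, not the model's actual BOS ID; with `transformers`, read it from `tokenizer.bos_token_id` and check whether your tokenizer already prepends it during encoding:

```python
# Hedged sketch: guarantee a BOS token leads every sample before it
# reaches the model. `bos_id` is a placeholder value for illustration.

def ensure_bos(token_ids, bos_id):
    """Prepend bos_id unless the sequence already starts with it."""
    if not token_ids or token_ids[0] != bos_id:
        return [bos_id] + list(token_ids)
    return list(token_ids)

sample = [523, 9176, 42]       # illustrative token IDs
print(ensure_bos(sample, 1))   # [1, 523, 9176, 42]
print(ensure_bos([1, 42], 1))  # already starts with BOS: [1, 42]
```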
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.set_default_device("cuda")
model = AutoModelForCausalLM.from_pretrained("PrimeIntellect/INTELLECT-1-Instruct")
tokenizer = AutoTokenizer.from_pretrained("PrimeIntellect/INTELLECT-1-Instruct")

input_text = "What is the Metamorphosis of Prime Intellect about?"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output_ids = model.generate(input_ids, max_length=50, num_return_sequences=1)
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(output_text)
```

### Example text generation pipeline
```python
import torch
from transformers import pipeline

torch.set_default_device("cuda")

pipe = pipeline("text-generation", model="PrimeIntellect/INTELLECT-1")
print(pipe("What is prime intellect ?"))
```