duyntnet
/

INTELLECT-1-Instruct-imatrix-GGUF

+---
+license: other
+language:
+- en
+pipeline_tag: text-generation
+inference: false
+tags:
+- transformers
+- gguf
+- imatrix
+- INTELLECT-1-Instruct
+---
+Quantizations of https://huggingface.co/PrimeIntellect/INTELLECT-1-Instruct
+### Inference Clients/UIs
+* [llama.cpp](https://github.com/ggerganov/llama.cpp)
+* [KoboldCPP](https://github.com/LostRuins/koboldcpp)
+* [ollama](https://github.com/ollama/ollama)
+* [jan](https://github.com/janhq/jan)
+* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
+* [GPT4All](https://github.com/nomic-ai/gpt4all)
+---
+# From original readme
+**INTELLECT-1** is the first collaboratively trained 10 billion parameter language model trained from scratch on 1 trillion tokens of English text and code.
+This is an instruct model. The base model associated with it is [INTELLECT-1](https://huggingface.co/PrimeIntellect/INTELLECT-1).
+**INTELLECT-1** was trained on up to 14 concurrent nodes distributed across 3 continents, with contributions from 30 independent community contributors providing compute.
+The training code utilizes the [prime framework](https://github.com/PrimeIntellect-ai/prime), a scalable distributed training framework designed for fault-tolerant, dynamically scaling, high-perfomance training on unreliable, globally distributed workers.
+The key abstraction that allows dynamic scaling is the `ElasticDeviceMesh` which manages dynamic global process groups for fault-tolerant communication across the internet and local process groups for communication within a node.
+The model was trained using the [DiLoCo](https://arxiv.org/abs/2311.08105) algorithms with 100 inner steps. The global all-reduce was done with custom int8 all-reduce kernels to reduce the communication payload required, greatly reducing the communication overhead by a factor 400x.
+For more detailed technical insights, please refer to our [technical paper](https://github.com/PrimeIntellect-ai/prime).
+**Note: You must add a BOS token at the beginning of each sample. Performance may be impacted otherwise.**
+## Usage
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+torch.set_default_device("cuda")
+model = AutoModelForCausalLM.from_pretrained("PrimeIntellect/INTELLECT-1-Instruct")
+tokenizer = AutoTokenizer.from_pretrained("PrimeIntellect/INTELLECT-1-Instruct")
+input_text = "What is the Metamorphosis of Prime Intellect about?"
+input_ids = tokenizer.encode(input_text, return_tensors="pt")
+output_ids = model.generate(input_ids, max_length=50, num_return_sequences=1)
+output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
+print(output_text)
+```
+### Example text generation pipeline
+```python
+import torch
+from transformers import pipeline
+torch.set_default_device("cuda")
+pipe = pipeline("text-generation", model="PrimeIntellect/INTELLECT-1")
+print(pipe("What is prime intellect ?"))
+```