|
--- |
|
license: other |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
inference: false |
|
tags: |
|
- transformers |
|
- gguf |
|
- imatrix |
|
- Deepthink-Reasoning-7B |
|
--- |
|
Quantizations of https://huggingface.co./prithivMLmods/Deepthink-Reasoning-7B |
|
|
|
### Inference Clients/UIs |
|
* [llama.cpp](https://github.com/ggerganov/llama.cpp) |
|
* [KoboldCPP](https://github.com/LostRuins/koboldcpp) |
|
* [ollama](https://github.com/ollama/ollama) |
|
* [jan](https://github.com/janhq/jan) |
|
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui) |
|
* [GPT4All](https://github.com/nomic-ai/gpt4all) |
|
--- |
|
|
|
# From original readme |
|
|
|
The **Deepthink-Reasoning-7B** is a fine-tuned version of the **Qwen2.5-7B-Instruct** base model, designed for text generation tasks that require deep reasoning, logical structuring, and problem-solving. This model leverages its optimized architecture to provide accurate and contextually relevant outputs for complex queries, making it ideal for applications in education, programming, and creative writing. |
|
|
|
With its robust natural language processing capabilities, **Deepthink-Reasoning-7B** excels in generating step-by-step solutions, creative content, and logical analyses. Its architecture integrates advanced understanding of both structured and unstructured data, ensuring precise text generation aligned with user inputs. |
|
|
|
- Significantly **more knowledge** and has greatly improved capabilities in **coding** and **mathematics**, thanks to our specialized expert models in these domains. |
|
- Significant improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g, tables), and **generating structured outputs** especially JSON. **More resilient to the diversity of system prompts**, enhancing role-play implementation and condition-setting for chatbots. |
|
- **Long-context Support** up to 128K tokens and can generate up to 8K tokens. |
|
- **Multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. |
|
|
|
# **Demo Start** |
|
|
|
Here provides a code snippet with `apply_chat_template` to show you how to load the tokenizer and model and how to generate contents. |
|
|
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
model_name = "prithivMLmods/Deepthink-Reasoning-7B" |
|
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
model_name, |
|
torch_dtype="auto", |
|
device_map="auto" |
|
) |
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
|
|
prompt = "Give me a short introduction to large language model." |
|
messages = [ |
|
{"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."}, |
|
{"role": "user", "content": prompt} |
|
] |
|
text = tokenizer.apply_chat_template( |
|
messages, |
|
tokenize=False, |
|
add_generation_prompt=True |
|
) |
|
model_inputs = tokenizer([text], return_tensors="pt").to(model.device) |
|
|
|
generated_ids = model.generate( |
|
**model_inputs, |
|
max_new_tokens=512 |
|
) |
|
generated_ids = [ |
|
output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids) |
|
] |
|
|
|
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0] |
|
``` |