# Model Card for Vijayendra/DeepSeek-Llama3.1-8B-DeepThinker-v1
## How to Use
```python
!pip install transformers accelerate bitsandbytes  # accelerate is required for device_map="auto"; bitsandbytes only for quantized loading

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model name on Hugging Face
MODEL_NAME = "Vijayendra/DeepSeek-Llama3.1-8B-DeepThinker-v1"

# 🛠 Load model & tokenizer from Hugging Face
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
if tokenizer.pad_token is None:
    # Llama tokenizers ship without a pad token; reuse EOS so padding works
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    device_map="auto",          # automatically assigns model layers to available GPUs/CPUs
    torch_dtype=torch.float16,  # 16-bit precision for memory efficiency
)
# Note: no .to("cuda") here — device_map="auto" already handles placement,
# and manually moving a dispatched model can raise errors.
```
```python
# 🛠 Define inference function
def generate_response(model, tokenizer, prompt, max_new_tokens=2048, temperature=0.7):
    # Tokenize input and move it to the model's device
    inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True).to(model.device)

    # Generate response
    with torch.no_grad():
        generated_tokens = model.generate(
            inputs.input_ids,
            attention_mask=inputs.attention_mask,  # pass the attention mask explicitly
            max_new_tokens=max_new_tokens,
            temperature=temperature,
            do_sample=True,
            top_k=40,
            top_p=0.9,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )

    # Decode response
    return tokenizer.decode(generated_tokens[0], skip_special_tokens=True)

# Test questions
questions = [
    "The sun orbits the Earth once every 365 days. Is this true?",
    "Write a brief summary about the impact of World War I and World War II on human history, ensuring that you do not hallucinate numbers or dates.",
    "Explain in detail how a nuclear reactor works, including the roles of moderation, control rods, and coolant, without resorting to overly generic explanations.",
    "Analyze the ethical implications of using AI in decision-making within the criminal justice system, highlighting both potential benefits and risks.",
]

# Generate and print responses
for i, question in enumerate(questions, 1):
    response = generate_response(model, tokenizer, question)
    print(f"\n🟢 Question {i}: {question}")
    print(f"🔵 Response: {response}")
```
## Framework versions

- PEFT 0.14.0
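The PEFT dependency suggests the model was fine-tuned with a parameter-efficient method such as LoRA. If the repository hosts adapter weights rather than fully merged weights (an assumption; the loading code above treats it as a full checkpoint), peft's `AutoPeftModelForCausalLM` can load the base model and attach the adapter in one call:

```python
from peft import AutoPeftModelForCausalLM
import torch

# Only applicable if the repo contains a PEFT adapter (adapter_config.json) — an assumption
model = AutoPeftModelForCausalLM.from_pretrained(
    MODEL_NAME,
    device_map="auto",
    torch_dtype=torch.float16,
)
```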
## Base model

- deepseek-ai/DeepSeek-R1-Distill-Llama-8B