Nidum-Llama-3.2-3B-Uncensored-MLX-8bit
Welcome to Nidum!
At Nidum, our mission is to bring cutting-edge AI capabilities to everyone with unrestricted access to innovation. With Nidum-Llama-3.2-3B-Uncensored-MLX-8bit, you get an optimized, efficient, and versatile AI model for diverse applications.
Discover Nidum's Open-Source Projects on GitHub: https://github.com/NidumAI-Inc
Key Features
- Efficient and Compact: Quantized to the MLX 8-bit format for improved performance and reduced memory demands.
- Wide Applicability: Suitable for technical problem-solving, educational content, and conversational tasks.
- Advanced Context Awareness: Handles long-context conversations with exceptional coherence.
- Streamlined Integration: Optimized for use with the mlx-lm library for effortless development.
- Unrestricted Responses: Offers uncensored answers across all supported domains.
How to Use
To use Nidum-Llama-3.2-3B-Uncensored-MLX-8bit, install the mlx-lm library and follow these steps:
Installation
pip install mlx-lm
Usage
from mlx_lm import load, generate

# Load the model and tokenizer
model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit")

# Create a prompt
prompt = "hello"

# Apply the chat template if available
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Generate the response
response = generate(model, tokenizer, prompt=prompt, verbose=True)

# Print the response
print(response)
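The chat-template step above wraps the raw prompt in the model's special tokens before generation. As a rough illustration only (the exact template shipped with this tokenizer may differ), a single user turn in the standard Llama 3.x format looks approximately like this; `format_llama3_chat` is a hypothetical helper, not part of mlx-lm:

```python
# Illustration: approximate the string a Llama 3.x chat template produces
# for one user turn. This is an assumption about the template format,
# not the tokenizer's actual output -- prefer apply_chat_template in practice.
def format_llama3_chat(user_message: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(format_llama3_chat("hello"))
```

Passing a string already formatted this way (with `add_generation_prompt=True` semantics, i.e. ending at the assistant header) is what the conditional block in the usage example achieves automatically.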
About the Model
The nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit model, converted using mlx-lm version 0.19.2, brings:
- Memory Efficiency: Tailored for systems with limited hardware.
- Performance Optimization: Closely approximates the quality of the original model while delivering faster inference.
- Plug-and-Play: Easily integrates with the mlx-lm library for deployment ease.
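The memory-efficiency claim can be put in rough numbers. A back-of-envelope sketch (weights only; real usage also includes activations, KV cache, and quantization metadata, so treat these figures as approximate):

```python
# Back-of-envelope weight-memory estimate for a 3B-parameter model.
# Ignores activations, KV cache, and quantization metadata.
params = 3_000_000_000
fp16_gb = params * 2 / 1e9   # 2 bytes per weight in fp16
int8_gb = params * 1 / 1e9   # 1 byte per weight at 8-bit
print(f"fp16 ~ {fp16_gb:.1f} GB, 8-bit ~ {int8_gb:.1f} GB")
```

Halving the weight footprint from roughly 6 GB to roughly 3 GB is what makes the model practical on memory-constrained Apple silicon machines.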
Use Cases
- Problem Solving in Tech and Science
- Educational and Research Assistance
- Creative Writing and Brainstorming
- Extended Dialogues
- Uninhibited Knowledge Exploration
Datasets and Fine-Tuning
Derived from Nidum-Llama-3.2-3B-Uncensored, the MLX-8bit version inherits:
- Uncensored Fine-Tuning: Delivers detailed and open-ended responses.
- RAG-Based Optimization: Enhances retrieval-augmented generation for data-driven tasks.
- Math Reasoning Support: Precise mathematical computations and explanations.
- Long-Context Training: Ensures relevance and coherence in extended conversations.
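To make the RAG-based optimization point concrete, here is a minimal retrieval-augmented prompt builder. This is an illustration only; the model card does not prescribe a specific RAG pipeline, and `retrieve`/`build_prompt` are hypothetical helpers (in practice you would use embedding-based retrieval rather than keyword overlap):

```python
import re

def tokens(text: str) -> set[str]:
    # Lowercase word tokens, punctuation stripped
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, docs: list[str]) -> str:
    # Pick the document with the largest keyword overlap with the query
    q = tokens(query)
    return max(docs, key=lambda d: len(q & tokens(d)))

def build_prompt(query: str, docs: list[str]) -> str:
    # Prepend the retrieved context so the model can ground its answer
    context = retrieve(query, docs)
    return f"Context: {context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "MLX is an array framework for machine learning on Apple silicon.",
    "Retrieval-augmented generation grounds answers in external documents.",
]
print(build_prompt("What is MLX?", docs))
```

The resulting string would then be passed as `prompt` to `generate` exactly as in the usage example above.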
Quantized Model Download
The MLX-8bit format strikes a practical balance between memory footprint and output quality.
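For intuition, the sketch below shows symmetric 8-bit quantization of a small weight vector and the round-trip error it introduces. This is an illustration of the general idea only; MLX's actual scheme uses grouped affine quantization with its own scale and bias parameters:

```python
# Symmetric 8-bit quantization sketch (not MLX's actual scheme).
def quantize(weights: list[float]) -> tuple[list[int], float]:
    # Map the largest-magnitude weight to +/-127 and round the rest
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [x * scale for x in q]

w = [0.5, -1.27, 0.03, 1.0]
q, s = quantize(w)
w_hat = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, max_err)  # round-trip error is tiny for well-scaled weights
```

Each weight costs one byte instead of two (fp16) or four (fp32), at the price of a small, bounded rounding error per value.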
Benchmark
| Benchmark | Metric | LLaMA 3B | Nidum 3B | Observation |
|---|---|---|---|---|
| GPQA | Exact Match (Flexible) | 0.3 | 0.5 | Nidum 3B achieves notable improvement in generative tasks. |
| GPQA | Accuracy | 0.4 | 0.5 | Demonstrates strong performance, especially in zero-shot tasks. |
| HellaSwag | Accuracy | 0.3 | 0.4 | Excels in common-sense reasoning tasks. |
| HellaSwag | Normalized Accuracy | 0.3 | 0.4 | Strong contextual understanding in sentence completion tasks. |
| HellaSwag | Normalized Accuracy (Stderr) | 0.15275 | 0.1633 | Enhanced consistency in normalized accuracy. |
| HellaSwag | Accuracy (Stderr) | 0.15275 | 0.1633 | Demonstrates robustness in reasoning accuracy compared to LLaMA 3B. |
Insights
- High Performance, Low Resource: The MLX-8bit format is ideal for environments with limited memory and processing power.
- Seamless Integration: Designed for smooth integration into lightweight systems and workflows.
Contributing
Join us in enhancing the MLX-8bit model's capabilities. Contact us for collaboration opportunities.
Contact
For questions, support, or feedback, email [email protected].
Experience the Future
Harness the power of Nidum-Llama-3.2-3B-Uncensored-MLX-8bit for a perfect blend of performance and efficiency.
Model tree for nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit
Base model
meta-llama/Llama-3.2-3B-Instruct