Nidum-Llama-3.2-3B-Uncensored-MLX-4bit

Welcome to Nidum!

At Nidum, we are committed to delivering cutting-edge AI models that offer advanced capabilities and unrestricted access to innovation. With Nidum-Llama-3.2-3B-Uncensored-MLX-4bit, we bring you a performance-optimized, space-efficient, and feature-rich model designed for diverse use cases.


Explore Nidum's Open-Source Projects on GitHub: https://github.com/NidumAI-Inc


Key Features

  1. Compact and Efficient: Quantized to 4 bits in the MLX format for strong performance with minimal memory usage (see the back-of-envelope estimate after this list).
  2. Versatility: Excels in a wide range of tasks, including technical problem-solving, educational queries, and casual conversations.
  3. Extended Context Handling: Capable of maintaining coherence in long-context interactions.
  4. Seamless Integration: Enhanced compatibility with the mlx-lm library for a streamlined development experience.
  5. Uncensored Access: Provides uninhibited responses across a variety of topics and applications.
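
As a rough, illustrative estimate of the memory claim in point 1 (ignoring activations, the KV cache, and quantization metadata such as scales), 4-bit weights for a 3-billion-parameter model occupy on the order of 1.5 GB:

# Back-of-envelope weight memory for a 3B-parameter model at 4 bits per weight.
# Illustrative only: real usage adds quantization scales, activations, and cache.
params = 3_000_000_000
bits_per_weight = 4
weight_bytes = params * bits_per_weight / 8
print(f"~{weight_bytes / 1e9:.1f} GB of raw weight storage")  # ~1.5 GB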

How to Use

To use Nidum-Llama-3.2-3B-Uncensored-MLX-4bit, install the mlx-lm library and follow the example code below:

Installation

pip install mlx-lm
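
To confirm the installation before loading the model, you can query the installed version with the standard library (this generic check is ours, not part of the original card; the card notes the conversion was done with mlx-lm 0.19.2):

# Print the installed mlx-lm version using only the Python standard library.
import importlib.metadata

print(importlib.metadata.version("mlx-lm"))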

Usage

from mlx_lm import load, generate

# Load the model and tokenizer
model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit")

# Create a prompt
prompt = "hello"

# Apply the chat template if available
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Generate the response
response = generate(model, tokenizer, prompt=prompt, verbose=True)

# Print the response
print(response)
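
For multi-turn conversations, the same calls extend naturally: keep the message history in a list, re-apply the chat template each turn, and append the model's reply. The sketch below is illustrative rather than official; it uses only the load, generate, and apply_chat_template calls from the example above, plus max_tokens, a standard mlx-lm generation limit:

from mlx_lm import load, generate

model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit")

# Running conversation history as chat-template messages.
messages = [{"role": "user", "content": "Explain 4-bit quantization briefly."}]

# Render the full history into a single prompt string.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Generate a bounded-length reply.
reply = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(reply)

# Append the reply so the next user turn sees the whole exchange,
# which is how the model's long-context handling is exercised.
messages.append({"role": "assistant", "content": reply})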

About the Model

The nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit model was converted to MLX format from nidum/Nidum-Llama-3.2-3B-Uncensored using mlx-lm version 0.19.2 (a conversion sketch follows the list below), and offers the following benefits:

  • Smaller Memory Footprint: Ideal for environments with limited hardware resources.
  • High Performance: Retains the advanced capabilities of the original model while optimizing inference speed and efficiency.
  • Plug-and-Play Compatibility: Easily integrate with the mlx-lm ecosystem for seamless deployment.
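
For reference, a conversion of this kind can be reproduced with mlx-lm's convert utility. This is a sketch under the assumption that convert is exposed at the package top level with the quantize and q_bits parameters, as in recent mlx-lm releases; it is not necessarily the exact command the authors ran:

from mlx_lm import convert

# Convert the original Hugging Face weights to MLX and quantize to 4 bits.
# q_bits=4 matches this repository's format; mlx_path is the output directory.
convert(
    "nidum/Nidum-Llama-3.2-3B-Uncensored",
    mlx_path="Nidum-Llama-3.2-3B-Uncensored-MLX-4bit",
    quantize=True,
    q_bits=4,
)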

Use Cases

  • Technical Problem Solving
  • Research and Educational Assistance
  • Open-Ended Q&A
  • Creative Writing and Ideation
  • Long-Context Dialogues
  • Unrestricted Knowledge Exploration

Datasets and Fine-Tuning

The model inherits the fine-tuned capabilities of its predecessor, Nidum-Llama-3.2-3B-Uncensored, including:

  • Uncensored Data: Ensures detailed and uninhibited responses.
  • RAG-Based Fine-Tuning: Optimizes retrieval-augmented generation for information-intensive tasks (a usage sketch follows this list).
  • Math-Instruct Data: Tailored for precise mathematical reasoning.
  • Long-Context Fine-Tuning: Enhanced coherence and relevance in extended interactions.
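
To show how the RAG-oriented fine-tuning might be exercised at inference time, here is a minimal, hypothetical sketch. The retrieved_passages list and the prompt layout are placeholders for whatever retriever and format you actually use; only the load, generate, and apply_chat_template calls come from the usage example above:

from mlx_lm import load, generate

model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit")

# Hypothetical retrieved context; in practice this comes from your retriever.
retrieved_passages = [
    "MLX is an array framework for machine learning on Apple silicon.",
    "4-bit quantization stores each weight in 4 bits to cut memory use.",
]
question = "Why does 4-bit quantization reduce memory usage?"

# Prepend the retrieved context to the question, then apply the chat template.
context = "\n\n".join(retrieved_passages)
user_msg = (
    f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"
)
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": user_msg}],
    tokenize=False,
    add_generation_prompt=True,
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=200))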

Quantized Model Download

The MLX-4bit weights balance output quality against memory usage and are downloaded automatically from this repository the first time you call load with the model ID shown above.


Benchmark

Benchmark    Metric                         LLaMA 3B   Nidum 3B   Observation
GPQA         Exact Match (Flexible)         0.3        0.5        Nidum 3B demonstrates significant improvement, particularly in generative tasks.
             Accuracy                       0.4        0.5        Consistent improvement, especially in zero-shot scenarios.
HellaSwag    Accuracy                       0.3        0.4        Better performance in common sense reasoning tasks.
             Normalized Accuracy            0.3        0.4        Enhanced ability to understand and predict context in sentence completion.
             Normalized Accuracy (Stderr)   0.15275    0.1633     Slightly improved consistency in normalized accuracy.
             Accuracy (Stderr)              0.15275    0.1633     Shows robustness in reasoning accuracy compared to LLaMA 3B.

Insights:

  1. Compact Efficiency: The MLX-4bit model ensures high performance with reduced resource usage.
  2. Enhanced Usability: Optimized for seamless integration with lightweight deployment scenarios.

Contributing

We invite contributions to further enhance the MLX-4bit model's capabilities. Reach out to us for collaboration opportunities.


Contact

For inquiries, support, or feedback, email us at [email protected].


Explore the Future

Embrace the power of innovation with Nidum-Llama-3.2-3B-Uncensored-MLX-4bit—the ideal blend of performance and efficiency.

