Triangle104/AwA-1.5B-Q5_K_M-GGUF
This model was converted to GGUF format from Spestly/AwA-1.5B
using llama.cpp via ggml.ai's GGUF-my-repo space.
Refer to the original model card for more details on the model.
Model details:
AwA (Answers with Athena) is my portfolio project, showcasing a cutting-edge Chain-of-Thought (CoT) reasoning model. I created AwA to excel in providing detailed, step-by-step answers to complex questions across diverse domains. This model represents my dedication to advancing AI’s capability for enhanced comprehension, problem-solving, and knowledge synthesis.
Key Features
Chain-of-Thought Reasoning: AwA delivers step-by-step breakdowns of solutions, mimicking logical human thought processes.
Domain Versatility: Performs exceptionally across a wide range of domains, including mathematics, science, literature, and more.
Adaptive Responses: Adjusts answer depth and complexity based on input queries, catering to both novices and experts.
Interactive Design: Designed for educational tools, research assistants, and decision-making systems.
Intended Use Cases
Educational Applications: Supports learning by breaking down complex problems into manageable steps.
Research Assistance: Generates structured insights and explanations in academic or professional research.
Decision Support: Enhances understanding in business, engineering, and scientific contexts.
General Inquiry: Provides coherent, in-depth answers to everyday questions.
Type: Chain-of-Thought (CoT) Reasoning Model
Base Architecture: Adapted from qwen2
Parameters: 1.54B
Fine-tuning: Specialized fine-tuning on Chain-of-Thought reasoning datasets to enhance step-by-step explanatory capabilities.
Ethical Considerations
Bias Mitigation: I have taken steps to minimise biases in the training data. However, users are encouraged to cross-verify outputs in sensitive contexts.
Limitations: May not provide exhaustive answers for niche topics or domains outside its training scope.
User Responsibility: Designed as an assistive tool, not a replacement for expert human judgment.
Usage
Option A: Local
Using locally with the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="Spestly/AwA-1.5B")
pipe(messages)
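The pipeline call above returns a nested structure rather than a plain string. As a minimal sketch, assuming the chat output shape used by recent Transformers releases (a list of message dictionaries under `generated_text`), the helpers below build the message list and pull out only the assistant's reply. The names `build_messages`, `extract_reply`, `ask`, and `load_pipeline` are hypothetical and not part of the model card or the library.

```python
def build_messages(question, system_prompt=None):
    """Return a chat-style message list for the text-generation pipeline."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    return messages


def extract_reply(outputs):
    """Pull the assistant's text out of a chat pipeline result.

    Assumption: the result looks like
    [{"generated_text": [<input messages...>, {"role": "assistant", "content": "..."}]}]
    """
    generated = outputs[0]["generated_text"]
    if isinstance(generated, str):  # fallback if a plain string is returned
        return generated
    return generated[-1]["content"]


def ask(pipe, question, max_new_tokens=512):
    """Send one question through an already-created pipeline."""
    outputs = pipe(build_messages(question), max_new_tokens=max_new_tokens)
    return extract_reply(outputs)


def load_pipeline(model_id="Spestly/AwA-1.5B"):
    """Create the pipeline lazily; the import and download happen only here."""
    from transformers import pipeline
    return pipeline("text-generation", model=model_id)
```

A typical call would be `ask(load_pipeline(), "Why is the sky blue?")`. Step-by-step answers can be long, so `max_new_tokens` may need raising if replies are cut off.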
Option B: API & Space
You can use the AwA Hugging Face Space or the AwA API (coming soon).
Roadmap
More AwA model sizes, e.g., 7B and 14B
Create AwA API via spestly package
Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
brew install llama.cpp
Invoke the llama.cpp server or the CLI.
CLI:
llama-cli --hf-repo Triangle104/AwA-1.5B-Q5_K_M-GGUF --hf-file awa-1.5b-q5_k_m.gguf -p "The meaning to life and the universe is"
Server:
llama-server --hf-repo Triangle104/AwA-1.5B-Q5_K_M-GGUF --hf-file awa-1.5b-q5_k_m.gguf -c 2048
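Once the server is running, it can be queried over HTTP. The sketch below assumes the server listens on its default address (`127.0.0.1:8080`) and uses its OpenAI-compatible `/v1/chat/completions` endpoint; adjust `SERVER_URL` if you started it with a different host or port. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: llama-server was started with its default host and port.
SERVER_URL = "http://127.0.0.1:8080/v1/chat/completions"


def build_request(question, url=SERVER_URL, max_tokens=512):
    """Build a POST request carrying a single-turn chat payload."""
    payload = {
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_reply(body):
    """Extract the assistant text from an OpenAI-style JSON response body."""
    return json.loads(body)["choices"][0]["message"]["content"]


def ask_server(question, url=SERVER_URL, timeout=120):
    """Send the question to a running llama-server and return its reply."""
    with urllib.request.urlopen(build_request(question, url), timeout=timeout) as resp:
        return parse_reply(resp.read())
```

With the server up, `print(ask_server("Explain why 0.1 + 0.2 != 0.3 in floating point."))` should print the model's answer. Only the standard library is used, so no extra client package is required.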
Note: You can also use this checkpoint directly by following the usage steps listed in the llama.cpp repo.
Step 1: Clone llama.cpp from GitHub.
git clone https://github.com/ggerganov/llama.cpp
Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1
flag along with other hardware-specific flags (for example, LLAMA_CUDA=1 for Nvidia GPUs on Linux).
cd llama.cpp && LLAMA_CURL=1 make
Step 3: Run inference through the main binary.
./llama-cli --hf-repo Triangle104/AwA-1.5B-Q5_K_M-GGUF --hf-file awa-1.5b-q5_k_m.gguf -p "The meaning to life and the universe is"
or
./llama-server --hf-repo Triangle104/AwA-1.5B-Q5_K_M-GGUF --hf-file awa-1.5b-q5_k_m.gguf -c 2048
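For scripted use, the CLI invocation from Step 3 can be wrapped in a small Python helper. This is a sketch under stated assumptions: the binary sits at `./llama-cli` as built above, and `-n` is used to cap the number of generated tokens. Some builds start an interactive session by default, so check `llama-cli --help` for your version before relying on `run_cli` in a non-interactive script. The function names are hypothetical.

```python
import shlex
import subprocess

REPO = "Triangle104/AwA-1.5B-Q5_K_M-GGUF"
GGUF_FILE = "awa-1.5b-q5_k_m.gguf"


def build_cli_command(prompt, n_predict=256, binary="./llama-cli"):
    """Return the argument list for a one-shot llama-cli run."""
    return [
        binary,
        "--hf-repo", REPO,
        "--hf-file", GGUF_FILE,
        "-p", prompt,
        "-n", str(n_predict),  # cap on generated tokens
    ]


def run_cli(prompt, n_predict=256, binary="./llama-cli", timeout=600):
    """Run llama-cli and return whatever it wrote to stdout."""
    result = subprocess.run(
        build_cli_command(prompt, n_predict, binary),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
        timeout=timeout,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain quotes or spaces. To see the equivalent shell command, print `shlex.join(build_cli_command("your prompt"))`.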