Acrux-500M-o1-Journey Model Files

The Acrux-500M-o1-Journey is a lightweight, instruction-tuned language model fine-tuned from the Qwen2.5-0.5B-Instruct base model. At roughly 500 million parameters, it is designed for cost-effective deployment and fast text generation while maintaining strong instruction-following quality.

| File Name | Size | Description | Upload Status |
| --- | --- | --- | --- |
| .gitattributes | 1.57 kB | Git attributes for managing LFS files. | Uploaded |
| README.md | 195 Bytes | Model overview or documentation. | Updated |
| added_tokens.json | 657 Bytes | Custom tokens for the tokenizer. | Uploaded |
| config.json | 859 Bytes | Model configuration file. | Uploaded |
| generation_config.json | 280 Bytes | Configuration for text generation. | Uploaded |
| merges.txt | 1.82 MB | Merge rules for byte-pair encoding (BPE). | Uploaded |
| pytorch_model.bin | 988 MB | Model weights (PyTorch format). | Uploaded (LFS) |
| special_tokens_map.json | 644 Bytes | Mapping for special tokens. | Uploaded |
| tokenizer.json | 11.4 MB | Full tokenizer configuration. | Uploaded (LFS) |
| tokenizer_config.json | 7.73 kB | Additional tokenizer settings. | Uploaded |
| vocab.json | 2.78 MB | Vocabulary for the tokenizer. | Uploaded |
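
These files can also be fetched programmatically with the huggingface_hub library; the minimal sketch below simply downloads the whole repository into the default local Hugging Face cache.

    from huggingface_hub import snapshot_download

    # Download every file listed above (LFS-tracked weights included) into the
    # default local Hugging Face cache and return the resulting directory path.
    local_dir = snapshot_download(repo_id="prithivMLmods/Acrux-500M-o1-Journey")
    print(local_dir)  # contains config.json, the tokenizer files, and the model weights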

Key Features:

  1. Compact Size with Efficient Performance:
    The small parameter count (~500M) enables faster inference and lower hardware requirements.

  2. Instruction Optimization:
    Fine-tuned to follow prompts effectively, making it suitable for interactive applications and prompt-based tasks.

  3. Domain-Specific Training:
    Trained on the GAIR/o1-journey dataset, providing tailored capabilities for specific use cases.


Training Details:

  • Base model: Qwen2.5-0.5B-Instruct
  • Fine-tuning dataset: GAIR/o1-journey

Capabilities:

  1. Instruction Following:
    • Generates accurate and coherent responses to user instructions.
    • Handles summarization, question-answering, and conversational tasks (see the chat-template sketch after this list).
  2. Fast Inference:
    • The smaller size keeps latency low, making it well suited to real-time applications.
  3. Interactive AI Development:
    • Suitable for chatbots, virtual assistants, and instructional interfaces.
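
  A minimal instruction-following sketch is shown below. Because the tokenizer is inherited from the Qwen2.5-0.5B-Instruct base, prompts are assumed to go through its chat template; the example instruction and sampling values are illustrative only.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "prithivMLmods/Acrux-500M-o1-Journey"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Wrap the instruction in the chat template inherited from the Qwen2.5 base.
    messages = [{"role": "user", "content": "Summarize the benefits of small language models."}]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))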

Usage Instructions:

  1. Setup:
    Download all model files, ensuring compatibility with the Hugging Face Transformers library.

  2. Loading the Model:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_name = "prithivMLmods/Acrux-500M-o1-Journey"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    
  3. Sample Text Generation:

    input_text = "Explain the concept of machine learning in simple terms."
    inputs = tokenizer(input_text, return_tensors="pt")
    outputs = model.generate(**inputs, max_length=100, do_sample=True, temperature=0.7)  # do_sample=True so temperature takes effect
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    
  4. Optimize Generation:
    Adjust parameters in generation_config.json (or pass them at generation time; see the sketch after this list) for better control of the output, such as:

    • temperature for randomness.
    • top_p for sampling diversity.
    • max_length for output size.
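
    The same parameters can also be set per call through a GenerationConfig object instead of editing generation_config.json. The sketch below reuses the model, tokenizer, and inputs from the previous steps; the values are illustrative only.

    from transformers import GenerationConfig

    gen_config = GenerationConfig(
        do_sample=True,      # sampling must be enabled for temperature/top_p to take effect
        temperature=0.7,     # randomness
        top_p=0.9,           # nucleus-sampling diversity
        max_new_tokens=150,  # output length, counting only newly generated tokens
    )
    outputs = model.generate(**inputs, generation_config=gen_config)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))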

Model size: 494M params (Safetensors)
Tensor type: FP16

Model tree for prithivMLmods/Acrux-500M-o1-Journey

Base model: Qwen/Qwen2.5-0.5B (this model is a finetune of it)
Quantizations of this model: 4 models

Dataset used to train prithivMLmods/Acrux-500M-o1-Journey: GAIR/o1-journey
