Model Card for LLaMA-2-7B-NIEXCHE

This model was fine-tuned from LLaMA-2-7B on a Turkish agriculture question-answering (QA) dataset. It supports both Turkish and English and is intended for agriculture-related natural language processing (NLP) tasks.

Model Details

Model Description

  • Developed by: Fevzi KILAS (NIEXCHE)
  • Model type: Causal language model (LoRA fine-tune of LLaMA-2-7B)
  • Languages: Turkish, English
  • Fine-tuned from: meta-llama/Llama-2-7b

Model Sources

  • Training dataset: NIEXCHE/turkish_agriculture_QA_llama2_22.6k (Hugging Face Hub)

Uses

Direct Use

The model can be used directly for agriculture-related question answering in Turkish and English. Because it was fine-tuned specifically on agricultural Q&A, it is best suited to that domain and closely related use cases; a minimal usage sketch is shown below.
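
As a minimal illustration, the model could be queried with transformers as follows. The repo id and the [INST] prompt template are assumptions (the card does not state them), so substitute the model's actual Hub id and training prompt format.

```python
# Minimal inference sketch with transformers.
# The repo id below is hypothetical; the [INST] template is the common
# LLaMA-2 instruction format and is an assumption here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NIEXCHE/LLaMA-2-7B-NIEXCHE"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Turkish example: "What is the most suitable soil type for growing wheat?"
prompt = "<s>[INST] Buğday ekimi için en uygun toprak türü nedir? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```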

Out-of-Scope Use

The model may not perform well on general-knowledge questions outside the agriculture domain.

Training Details

Training Data

The training data is a custom dataset, NIEXCHE/turkish_agriculture_QA_llama2_22.6k, created by translating and cleaning agricultural QA data from an existing source. It contains 22.6k question-answer pairs in Turkish; a loading sketch is shown below.
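
For reference, the dataset can be loaded with the datasets library; the dataset id comes from this card, while the "train" split name is an assumption.

```python
# Loading the training data from the Hugging Face Hub.
from datasets import load_dataset

ds = load_dataset("NIEXCHE/turkish_agriculture_QA_llama2_22.6k", split="train")
print(ds)  # expected: ~22.6k Turkish question-answer pairs
```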

Training Procedure

The model was trained using the following frameworks and libraries:

  • Frameworks and libraries: PyTorch, transformers, accelerate==0.21.0, peft==0.4.0, bitsandbytes==0.40.2, trl==0.4.7
  • Precision: 4-bit quantization via bitsandbytes with float16 (mixed-precision) compute, used to reduce memory usage; see the sketch below.
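
A minimal sketch of that 4-bit setup, assuming the common QLoRA defaults (NF4 quantization, no nested quantization), which this card does not state explicitly:

```python
# 4-bit (bitsandbytes) loading with float16 compute, matching the pinned
# versions above. Quant type and double-quantization flags are assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # assumption: common QLoRA default
    bnb_4bit_compute_dtype=torch.float16,  # float16 mixed precision, as stated
    bnb_4bit_use_double_quant=False,       # assumption
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # HF-format checkpoint of the listed base model
    quantization_config=bnb_config,
    device_map="auto",
)
```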

Training Hyperparameters

  • Base Model: meta-llama/Llama-2-7b
  • Batch Size: 4 (per device)
  • Learning Rate: 2e-4
  • LoRA Parameters:
    • lora_r = 64
    • lora_alpha = 16
    • lora_dropout = 0.1
  • Epochs: 1
  • Optimizer: Paged AdamW (32-bit)
  • Gradient Accumulation Steps: 1
  • Scheduler: Cosine
  • Max Gradient Norm: 0.3
  • Gradient Checkpointing: Enabled
  • Warmup Ratio: 0.03
  • Group by Length: Enabled
  • Max Sequence Length: None
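
Put together, the hyperparameters above map onto a peft/trl setup roughly like the sketch below. The output directory, text column name, and tokenizer handling are assumptions; `base_model` and `ds` come from the earlier sketches.

```python
# Fine-tuning sketch combining the hyperparameters above
# (peft==0.4.0 / trl==0.4.7 era API).
from peft import LoraConfig
from transformers import AutoTokenizer, TrainingArguments
from trl import SFTTrainer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token  # assumption: common LLaMA-2 workaround

peft_config = LoraConfig(
    r=64,                 # lora_r
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./results",            # assumption
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    learning_rate=2e-4,
    num_train_epochs=1,
    optim="paged_adamw_32bit",         # Paged AdamW (32-bit)
    lr_scheduler_type="cosine",
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    group_by_length=True,
    gradient_checkpointing=True,
    fp16=True,                         # float16 mixed precision
)

trainer = SFTTrainer(
    model=base_model,
    train_dataset=ds,
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",         # assumption: formatted prompt column
    max_seq_length=None,               # matches "Max Sequence Length: None"
    args=training_args,
)
trainer.train()
```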

Hardware

  • Training Hardware: Google Colab Pro (NVIDIA A100 GPU, 53 GB system RAM)
  • Training Time: Approximately 1 hour 40 minutes.

Training output (as reported by the trainer):

  • Global steps: 5,654
  • Final training loss: 0.7829
  • Train runtime: 6,030 s (≈1 h 40 min)
  • Samples per second: 3.75
  • Steps per second: 0.938
  • Total FLOPs: ≈5.52 × 10^16
  • Epochs: 1.0

Evaluation

The same dataset (NIEXCHE/turkish_agriculture_QA_llama2_22.6k) was used for evaluation. Because the evaluation data overlaps the training data, results should not be read as a measure of generalization to unseen questions.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator; estimates for this training run are still pending.

  • Hardware Type: Google Colab (A100 GPU)
  • Hours used: 1 hour 40 minutes
  • Compute Region: Google Cloud (Colab)
  • Carbon Emitted: Estimations pending

Citation

If you use this model in your research or applications, please cite it as:

@misc{Fevzi2024LLaMA-2-7B-NIEXCHE,
  author       = {Fevzi KILAS},
  title        = {LLaMA-2-7B-NIEXCHE: A Turkish Agriculture QA Model},
  year         = {2024},
  howpublished = {\url{https://huggingface.co./NIEXCHE/turkish_agriculture_QA_llama2_22.6k}}
}

Contact:

NIEXCHE (Fevzi KILAS)
