silma-ai/SILMA-Kashif-2B-Instruct-v1.0

SILMA Kashif v1.0 (The Arabic RAG Model)

SILMA Kashif 2B Instruct v1.0 is the premier release within the SILMA Kashif family of models, designed specifically for RAG (Retrieval-Augmented Generation) tasks.
Kashif excels at one specific task: answering questions based on context passages in both Arabic and English. The model can also perform Entity Extraction as a minor skill.
SILMA Kashif 2B v1.0 stands out as the top-performing open model in RAG within the 3-9 billion parameter range, based on our evaluations using the SILMA RAGQA Benchmark.
SILMA Kashif is built on the foundational models of Google Gemma, merging their strengths to deliver strong performance for users.
Kashif is an open-weight model, free to use in accordance with our open license.
Finally, the model comes with a context length of 12k tokens.

Important note: Kashif is a specialized model that should ONLY be used in RAG setups. If you are looking for a general-purpose model, please refer to SILMA 9B Instruct v1.0.

Model Skills and Capabilities

The model was trained intensively to cover the following skills:

The ability to answer questions in Arabic and English
The ability to deal with short and long contexts
The ability to provide short and long answers effectively
The ability to answer complex numerical questions
The ability to answer questions based on tabular data
Answering multi-hop questions: The ability to answer a single question using pieces of data from multiple paragraphs
Negative rejection: The ability to avoid giving an inaccurate answer and instead return a statement such as "The answer cannot be found in the given context"
Multi-domains: The ability to answer questions based on texts from different fields such as finance, medical, legal, etc.
The ability to deal with ambiguous contexts
The ability to extract entities from text
The ability to deal with diverse and complex prompts
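
In a RAG pipeline, the negative-rejection behavior listed above can be detected downstream so that the application can fall back to a "no answer" path. The following is a minimal sketch, assuming the model replies with the English example phrase quoted in the list. The function name `is_negative_rejection` and the marker tuple are hypothetical, and real outputs (especially Arabic ones) may be worded differently.

```python
# Hypothetical marker list: only the English example phrase from the skills list.
# Extend it with whatever rejection wording you actually observe from the model.
REJECTION_MARKERS = ("the answer cannot be found in the given context",)


def is_negative_rejection(answer: str) -> bool:
    """Return True if the answer looks like a negative rejection."""
    # Lowercase and collapse whitespace so minor formatting differences do not matter.
    normalized = " ".join(answer.lower().split())
    return any(marker in normalized for marker in REJECTION_MARKERS)
```

A substring check like this is deliberately simple; it is a starting point for logging and fallbacks, not a reliable classifier.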

Model Evaluation

Dataset	exact_match	rouge1	bleu	bertscore
ragbench-finqa-en-test	0.000	0.587	0.321	0.760
ragbench-tatqa-ar-test	0.000	0.484	0.130	0.774
ragbench-tatqa-en-test	0.059	0.646	0.423	0.808
rag-instruct-benchmark-tester-en	0.370	0.683	0.196	0.791
ragbench-expertqa-en-test	0.000	0.465	0.151	0.677
ragbench-msmarco-ar-test	0.000	0.144	0.096	0.781
sciq-ar-test	0.170	0.000	0.000	0.753
ragbench-covidqa-en-test	0.020	0.521	0.242	0.734
ragbench-emanual-ar-test	0.000	0.237	0.159	0.806
ragbench-finqa-ar-test	0.000	0.377	0.109	0.780
xquad-r-validation-en	0.120	0.326	0.041	0.603
ragbench-emanual-en-test	0.000	0.565	0.288	0.722
xquad-r-ar-validation	0.070	0.130	0.042	0.698
boolq-ar-test	0.450	0.000	0.000	0.700
ragbench-hotpotqa-en-test	0.060	0.732	0.503	0.837
ragbench-covidqa-ar-test	0.000	0.179	0.104	0.783
ragbench-msmarco-en-test	0.020	0.491	0.207	0.729
Benchmark Average Scores	0.079	0.386	0.177	0.749

SILMA RAG QA Benchmark Score: 0.3478
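
The overall score above is numerically consistent with the unweighted mean of the four averaged metrics in the last table row. This is an observation from the reported numbers, not a documented definition of the benchmark. The sketch below only reproduces that arithmetic, and the variable names are illustrative.

```python
from statistics import mean

# Averages copied from the "Benchmark Average Scores" row of the table.
metric_averages = {
    "exact_match": 0.079,
    "rouge1": 0.386,
    "bleu": 0.177,
    "bertscore": 0.749,
}

# Assumption: the overall score is the unweighted mean of the four metrics.
overall = mean(metric_averages.values())
print(f"overall score (assumed formula): {overall:.4f}")
```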

SILMA AI

silma.ai is a leading GenAI startup that excels in building and tailoring cutting-edge Large Language Models (LLMs) and AI technologies for the Arabic language

Usage

Below are some code snippets to help you get started quickly with running the model. First, install the Transformers library with:

pip install -U transformers

Then, copy the snippet from the section below

Running with the `pipeline` API

import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="silma-ai/SILMA-Kashif-2B-Instruct-v1.0",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # replace with "mps" to run on a Mac device
)

messages = [
    {"role": "user", "content":
"""
أجب على السؤال بناءً على السياق أدناه

السياق:
تشمل الاتفاقيات رسوم حمل سنوية ثابت قدها 30 مليون جنيه إسترليني للقنوات نظراً لأن كلاً من مزوديها قادرين على تأمين دفعات إضافية إذا ما حققت هذه القنوات أهدافاً متعلقةً بالأداء.
لا يوجد حالياً ما يشير إلى ما إذا كان الاتفاق الجديد يشمل محتوىً إضافياً كالفيديو عند الطلب والدقة العالية ، كذلك الذي سبق أن قدمته بي سكاي بي.
وقد وافقت كل من بي سكاي بي و فيرجين ميديا على إنهاء الدعاوى القضائية بالمحكمة العليا ضد بعضهما بشأن معاليم الحمل التي تخص قنواتهما الأساسية.

السؤال: ماسم الشركة التي وافقت على إنهاء دعواها القضائية ضد بي سكاي بي بالمحكمة العليا؟
الإجابة:
"""},
]

outputs = pipe(messages, max_new_tokens=600)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)

Response:

فيرجين ميديا

"وقد وافقت كل من بي سكاي بي و فيرجين ميديا على إنهاء الدعاوى القضائية بالمحكمة العليا ضد بعضهما بشأن معاليم الحمل التي تخص قنواتهما الأساسية."

Note: for advanced usage examples such as multi-GPU, quantization, or chat templates, please refer to the SILMA v1.0 examples

Running with Ollama

ollama run hf.co/silma-ai/SILMA-Kashif-2B-Instruct-v1.0-GGUF

Prompt Format

Here is a recommended way to prompt the model. You can modify the prompt to suit your specific requirements, but if you run into problems, following the format below, which is the one we used to train the model, may help.

Arabic

أجب على السؤال بناءً على السياق أدناه

السياق:
.....
.....

السؤال: ...
الإجابة: ...

English

Answer the following question using the provided context below

Context: 
.....
.....

Question: ...
Answer: ...
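
The two templates above can be filled programmatically from retrieved passages. The following is a minimal sketch: the helper name `build_rag_prompt`, the `TEMPLATES` dictionary, and the choice to join passages with newlines are assumptions for illustration, and the English question and passage are a translation of part of the Arabic example in the Usage section.

```python
# Template text copied from the Arabic and English prompt formats above.
TEMPLATES = {
    "ar": "أجب على السؤال بناءً على السياق أدناه\n\nالسياق:\n{context}\n\nالسؤال: {question}\nالإجابة:",
    "en": "Answer the following question using the provided context below\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:",
}


def build_rag_prompt(question: str, passages: list[str], lang: str = "en") -> str:
    """Fill the recommended template with retrieved passages and a question."""
    if lang not in TEMPLATES:
        raise ValueError(f"unsupported language: {lang!r}")
    # Assumption: passages are simply placed one per line in the context block.
    context = "\n".join(p.strip() for p in passages)
    return TEMPLATES[lang].format(context=context, question=question.strip())


# Build a chat-style message list from one retrieved passage.
messages = [{
    "role": "user",
    "content": build_rag_prompt(
        "Which company agreed to end its High Court lawsuit against BSkyB?",
        ["BSkyB and Virgin Media have agreed to end their High Court lawsuits against each other."],
    ),
}]
```

The resulting `messages` list has the same shape as the one passed to `pipe(...)` in the `pipeline` example.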

GPU Requirements

The following are the minimum/recommended GPU requirements for running inference:

Recommended
- At least one GPU with a minimum of 24 GB of GPU memory
- Examples: Nvidia RTX 4090
Minimum
- At least one GPU with 8 GB of GPU memory
- Examples: Nvidia RTX 3070, RTX 3080 or T4
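
As a rough sanity check on these figures, the memory taken by the weights alone can be estimated from the parameter count and the width of each parameter. The sketch below assumes roughly 2 billion parameters (inferred from the "2B" in the model name; the exact count is not stated here) and ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


ASSUMED_PARAMS = 2e9  # assumption based on the "2B" in the model name

# bfloat16 stores 2 bytes per parameter; 4-bit quantization stores about 0.5.
for label, width in [("bfloat16", 2), ("4-bit", 0.5)]:
    print(f"{label}: ~{weight_memory_gib(ASSUMED_PARAMS, width):.1f} GiB")
```

Under these assumptions the bfloat16 weights fit within the 8 GB minimum, with the remainder available for the context and runtime overhead.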

Effect of Quantization

We observed a 2.6% drop in score (to 0.338) for the same model quantized to 4-bit

Citation

@article{silma_01_2025,
    title={SILMA Kashif 2B Instruct v1.0},
    url={https://huggingface.co/silma-ai/SILMA-Kashif-2B-Instruct-v1.0},
    publisher={SILMA AI},
    author={Silma Team},
    year={2025}
}

Intended Usage

The model should only be used in question answering use-cases such as RAG
The model can also be used to extract entities from text

Limitations

Because the model has few parameters, we have noticed that it is not very effective at complex numerical and financial reasoning, such as solving tricky calculations
The model has been trained specifically for text-based question answering, which may limit its ability to perform tasks beyond this scope, including simple tasks
