BERT Fine-Tuned for Question Answering (SQuAD)

Model Description

This model is a fine-tuned version of BERT-base-cased for extractive question answering. It was trained on SQuAD (the Stanford Question Answering Dataset) to locate the answer to a question inside a given context passage. BERT is a transformer encoder whose self-attention layers let every token's representation depend on the whole input, so the question and the context are read jointly. The question-answering head then predicts where the answer span starts and ends in the context.
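
To make the span-extraction idea concrete, here is a minimal sketch of how a start/end prediction can be turned into an answer. The tokens and logit values are made up for illustration, and `best_span` is a hypothetical helper, not part of the transformers library. The real pipeline does more work (for example, ignoring question tokens and mapping tokens back to characters).

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
) -> Tuple[int, int, float]:
    """Return (start, end, score) for the highest-scoring valid span.

    A span is valid when end >= start and it covers at most
    max_answer_len tokens. The score is start logit + end logit.
    """
    best = (0, 0, float("-inf"))
    for s, s_logit in enumerate(start_logits):
        last_end = min(s + max_answer_len, len(end_logits))
        for e in range(s, last_end):
            score = s_logit + end_logits[e]
            if score > best[2]:
                best = (s, e, score)
    return best


# Toy example: illustrative tokens and logits, not real model output.
tokens = ["the", "capital", "of", "france", "is", "paris", "."]
start_logits = [0.1, 0.2, 0.0, 1.5, 0.3, 4.0, 0.0]
end_logits = [0.0, 0.1, 0.0, 1.0, 0.2, 3.5, 0.5]

start, end, score = best_span(start_logits, end_logits)
answer_tokens = tokens[start:end + 1]
print(answer_tokens)
```

Because the answer is always a slice of the context, the model cannot produce text that does not appear in the passage.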

Intended Uses & Limitations

Intended Uses:

Question Answering: This model can be used to extract answers from a given context based on a specific question. It's suitable for applications such as chatbots, virtual assistants, and customer support systems where retrieving relevant information is crucial.
Information Retrieval: Useful as the answer-extraction step in retrieval systems, where a short, specific answer must be pulled from passages drawn from a larger body of text.

Limitations:

Domain Adaptation: The model may not perform well on domains that are significantly different from the training data (e.g., technical manuals, medical documents).
Context Size Limitation: BERT accepts at most 512 tokens per input, and that budget is shared by the question, the context, and the special tokens. Longer contexts must be shortened or split into overlapping chunks.
Bias and Fairness: The model may reflect biases present in the SQuAD dataset and its pretraining corpus, potentially affecting the impartiality of answers.
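
The context-size limitation above is usually handled with a sliding window over the tokenized context. The sketch below shows the idea on plain integer lists. `context_budget` and `sliding_windows` are hypothetical helpers written for this example, and the window sizes are toy values. In practice the transformers question-answering pipeline offers `max_seq_len` and `doc_stride` arguments that cover the same ground; check the documentation for your installed version.

```python
from typing import List

MAX_INPUT_TOKENS = 512  # BERT's input length limit
SPECIAL_TOKENS = 3      # [CLS] question [SEP] context [SEP]


def context_budget(num_question_tokens: int) -> int:
    """Tokens left for the context once the question is accounted for."""
    return MAX_INPUT_TOKENS - SPECIAL_TOKENS - num_question_tokens


def sliding_windows(token_ids: List[int], window: int, overlap: int) -> List[List[int]]:
    """Split token_ids into chunks of at most `window` tokens.

    Consecutive chunks share `overlap` tokens, so an answer that
    straddles a chunk boundary still appears whole in some chunk.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    step = window - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # the last chunk already reaches the end
    return chunks


# Toy example: 10 "tokens", windows of 4 with 2 tokens of overlap.
context_ids = list(range(10))
chunks = sliding_windows(context_ids, window=4, overlap=2)
print(chunks)
print(context_budget(num_question_tokens=12))
```

Each chunk is then scored separately, and the highest-scoring answer across chunks is kept.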

How to Use

To use this model for question answering, load it through the pipeline API of the Hugging Face transformers library. Here’s a Python example:

from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
model_checkpoint = "Ashaduzzaman/bert-finetuned-squad"
question_answerer = pipeline("question-answering", model=model_checkpoint)

question = "What are the names of the architectures?"
context = """
🤗 Transformers (formerly known as pytorch-transformers and pytorch-pretrained-
and pytorch-nlp) provides general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural
Language Understanding (NLU) and Natural Language Generation (NLG) with over 32+ pretrained models in 100+ languages and
with state-of-the-art performance on SQuAD, GLUE, AWS Glue, and other benchmarks.
"""

# The pipeline returns a dict with 'score', 'start', 'end', and 'answer' keys.
result = question_answerer(question=question, context=context)
print(result['answer'])

Training and Evaluation Data

Dataset Used: The model was fine-tuned on the SQuAD dataset, a benchmark for training and evaluating question-answering models. SQuAD pairs questions with context paragraphs and labels each answer as a span of text within its paragraph.
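
For orientation, a SQuAD record as exposed by the Hugging Face datasets library has the shape sketched below. The field names follow that format, but the record itself (id, text, and offsets) is invented for illustration, and `answer_is_aligned` is a hypothetical helper.

```python
# A made-up record in the SQuAD layout; not taken from the dataset.
example = {
    "id": "example-0001",
    "title": "Example",
    "context": "BERT was introduced by researchers at Google in 2018.",
    "question": "Who introduced BERT?",
    "answers": {
        "text": ["researchers at Google"],
        "answer_start": [23],  # character offset into the context
    },
}


def answer_is_aligned(record: dict) -> bool:
    """Check that every labeled answer is found at its stated offset."""
    context = record["context"]
    answers = record["answers"]
    return all(
        context[start:start + len(text)] == text
        for text, start in zip(answers["text"], answers["answer_start"])
    )


print(answer_is_aligned(example))
```

The character offsets are what fine-tuning converts into the start and end token positions the model learns to predict.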

Training Procedure

The model was trained using the Hugging Face transformers library with the following hyperparameters:

Learning Rate: 2e-05
Training Batch Size: 8
Evaluation Batch Size: 8
Seed: 42
Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
Learning Rate Scheduler: Linear
Number of Epochs: 1
Mixed Precision Training: Native AMP (Automatic Mixed Precision)
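
As a rough guide to reproducing this setup, the hyperparameters above could be written as keyword arguments for `transformers.TrainingArguments`. The keyword names exist in that class, but the mapping is an assumption: the card does not say whether the batch sizes are per device, and `fp16=True` is only a guess at what "Native AMP" corresponds to. The `linear_lr` function is a hypothetical helper that sketches a linear schedule, assuming no warmup and an illustrative step count.

```python
# Assumed mapping from the listed hyperparameters to TrainingArguments
# keyword names. An output_dir would also be needed in practice.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,  # assumes the batch size is per device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1,
    "fp16": True,  # assumed equivalent of "Native AMP"; needs a GPU
}


def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Illustrative step count only; the card does not report the real one.
total_steps = 1000
for step in (0, 500, 1000):
    print(step, linear_lr(step, total_steps))
```

With no warmup, the rate starts at 2e-05 and falls in a straight line to zero at the final step.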

Training Results

Final Training Loss: 1.22
Exact Match (EM): 79.99
F1 Score: 87.55

Evaluation

The model's performance was evaluated using the standard SQuAD metrics, Exact Match (EM) and F1. Exact Match is the percentage of predictions that match a reference answer exactly after normalization. F1 measures token-level overlap between the prediction and the reference, so a partially correct answer still earns partial credit.
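
The two metrics can be sketched for a single prediction as follows. This follows the normalization used by the SQuAD v1.1 evaluation script (lowercasing, then stripping punctuation, articles, and extra whitespace), but it is a simplified sketch rather than the official scorer. The official scorer also takes the best score over several reference answers and averages over the dataset.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower.", "eiffel tower"))
print(f1_score("Eiffel Tower in Paris", "the Eiffel Tower"))
```

The second prediction contains the full reference plus two extra words, so it scores zero on Exact Match but keeps partial credit on F1.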

Framework Versions

Transformers: 4.42.4
PyTorch: 2.3.1+cu121
Datasets: 2.21.0
Tokenizers: 0.19.1
