metadata

library_name: transformers
base_model: bert-base-chinese
tags:
  - generated_from_trainer
datasets:
  - real-jiakai/chinese-squadv2
model-index:
  - name: chinese_squadv2
    results: []

bert-base-chinese-finetuned-squadv2

This model is a fine-tuned version of bert-base-chinese on the Chinese SQuAD v2.0 dataset.

Model Description

This model is designed for Chinese extractive question answering, where the answer is a span of text taken from a given context paragraph. It handles both answerable and unanswerable questions, following the SQuAD v2.0 format.

Key features:

Based on the bert-base-chinese architecture
Supports both answerable and unanswerable questions
Trained on Chinese question-answer pairs
Optimized for extractive question answering

Intended Uses & Limitations

Intended Uses

Chinese extractive question answering
Reading comprehension tasks
Information extraction from Chinese text
Automated question answering systems

Limitations

Performance is significantly better on unanswerable questions (76.65% exact match) than on answerable questions (36.41% exact match)
Limited to extractive QA (cannot generate new answers)
May not perform well on domain-specific questions outside the training data
Designed for modern Chinese text; may not work well with classical Chinese or dialectal variations

Training and Evaluation Data

The model was trained on the Chinese SQuAD v2.0 dataset, which contains:

Training Set:

Total examples: 90,027
Answerable questions: 46,529
Unanswerable questions: 43,498

Validation Set:

Total examples: 9,936
Answerable questions: 3,991
Unanswerable questions: 5,945

Training Procedure

Training Hyperparameters

Learning rate: 3e-05
Batch size: 12
Evaluation batch size: 8
Number of epochs: 5
Optimizer: AdamW (β1=0.9, β2=0.999, ε=1e-08)
Learning rate scheduler: Linear
Maximum sequence length: 384
Document stride: 128
Training device: CUDA-enabled GPU
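
The interaction between the maximum sequence length and the document stride can be sketched in plain Python. This is a rough illustration, not the training code: it assumes the stride is the number of context tokens shared by consecutive windows (the meaning of the `stride` argument in Hugging Face tokenizers) and that each window is laid out as [CLS] question [SEP] context [SEP]. The function name `window_spans` and the example token counts are hypothetical.

```python
def window_spans(n_context, n_question, max_len=384, overlap=128):
    """Return (start, end) context-token ranges, one per window.

    Assumption: `overlap` is the number of context tokens shared by
    two consecutive windows, and three special tokens are reserved
    for [CLS] question [SEP] context [SEP].
    """
    room = max_len - n_question - 3  # context tokens that fit in one window
    if room <= overlap:
        raise ValueError("question too long for this max_len/overlap")
    step = room - overlap  # how far each new window advances
    spans = []
    start = 0
    while True:
        end = min(start + room, n_context)
        spans.append((start, end))
        if end == n_context:
            break
        start += step
    return spans


# Hypothetical sizes: a 1,000-token context and a 21-token question
spans = window_spans(1000, 21)
for start, end in spans:
    print(start, end)
```

Under these assumptions the long context is split into four overlapping windows, so a single paragraph can produce several training or evaluation features.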

Training Results

Final evaluation metrics:

Overall Exact Match: 60.49%
Overall F1 Score: 60.54%
Answerable Questions:
- Exact Match: 36.41%
- F1 Score: 36.53%
Unanswerable Questions:
- Exact Match: 76.65%
- F1 Score: 76.65%
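
As a consistency check, the overall scores should be the average of the per-class scores weighted by the number of validation examples in each class. The short sketch below recomputes them from the figures reported in this card; because the per-class percentages are already rounded, the result is only expected to agree with the reported overall values to within about 0.01. The helper name `weighted` is illustrative.

```python
# Validation set sizes reported in this card
n_answerable = 3991
n_unanswerable = 5945

# Per-class metrics reported in this card (percent)
em_answerable, em_unanswerable = 36.41, 76.65
f1_answerable, f1_unanswerable = 36.53, 76.65


def weighted(answerable_score, unanswerable_score):
    """Average two per-class scores, weighted by class size."""
    total = n_answerable + n_unanswerable
    return (n_answerable * answerable_score
            + n_unanswerable * unanswerable_score) / total


overall_em = weighted(em_answerable, em_unanswerable)
overall_f1 = weighted(f1_answerable, f1_unanswerable)
print(f"Overall EM: {overall_em:.2f}")
print(f"Overall F1: {overall_f1:.2f}")
```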

Framework Versions

Transformers: 4.47.0.dev0
PyTorch: 2.5.1+cu124
Datasets: 3.1.0
Tokenizers: 0.20.3

Usage

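from transformers import AutoModelForQuestionAnswering, AutoTokenizer
import torch

# Load the fine-tuned model and its tokenizer
model_name = "real-jiakai/bert-base-chinese-finetuned-squadv2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
model.eval()

# Prepare the inputs (question first, then context)
question = "your_question"
context = "your_context"
inputs = tokenizer(
    question,
    context,
    add_special_tokens=True,
    truncation="only_second",  # truncate the context, never the question
    max_length=384,            # maximum sequence length used in training
    return_tensors="pt"
)

# Run the model; it returns an output object, not a (start, end) tuple
with torch.no_grad():
    outputs = model(**inputs)
start_index = int(torch.argmax(outputs.start_logits))
end_index = int(torch.argmax(outputs.end_logits))

# Get the answer. By the usual SQuAD v2.0 convention, a span that points at
# [CLS] (index 0) means "no answer"; an end before the start is also invalid.
if start_index == 0 or end_index < start_index:
    answer = ""
else:
    answer = tokenizer.convert_tokens_to_string(
        tokenizer.convert_ids_to_tokens(inputs["input_ids"][0][start_index:end_index+1])
    )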
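
Taking the argmax of the start and end scores independently can yield an end position before the start and does not weigh the no-answer option against the best span. A common SQuAD v2.0 style post-processing step scores every valid span jointly and compares the best one with the "null" span at the [CLS] position. The sketch below shows that idea on plain Python lists; the function name `best_span`, the `max_answer_len` limit, and the `null_threshold` value are illustrative assumptions, not settings taken from this model's training or evaluation.

```python
def best_span(start_logits, end_logits, max_answer_len=30, null_threshold=0.0):
    """Return (start, end) token indices, or None when "no answer" wins.

    Assumptions: position 0 is [CLS] and its combined score represents
    "no answer". For brevity, question tokens are not masked out here;
    a fuller implementation would restrict spans to the context.
    """
    null_score = start_logits[0] + end_logits[0]
    best, best_score = None, float("-inf")
    for s in range(1, len(start_logits)):
        # Only consider spans that end at or after the start
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    # Predict "no answer" when the null score beats the best span by the margin
    if best is None or null_score - best_score > null_threshold:
        return None
    return best


# Toy scores (made up for illustration)
answerable = best_span([0.1, 0.2, 5.0, 0.3], [0.1, 0.1, 0.2, 4.0])
unanswerable = best_span([6.0, 0.0, 1.0, 0.0], [6.0, 0.0, 0.0, 1.0])
print(answerable, unanswerable)
```

To apply this to the model, pass the per-token start and end scores for a single example, converted to Python lists with `.tolist()`. Raising `null_threshold` makes the sketch less likely to abstain, which may matter given the gap between answerable and unanswerable performance reported above.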

Limitations and Bias

The model shows a significant performance disparity between answerable and unanswerable questions, which might indicate:

Dataset quality issues
Potential translation artifacts in the Chinese version of SQuAD
Imbalanced handling of answerable vs. unanswerable questions

Ethics & Responsible AI

Users should be aware that:

The model may reflect biases present in the training data
Performance varies significantly based on question type
Results should be validated for critical applications
The model should not be used as the sole decision-maker in critical systems
