---
library_name: transformers
base_model: bert-base-chinese
tags:
- generated_from_trainer
datasets:
- real-jiakai/chinese-squadv2
model-index:
- name: chinese_squadv2
  results: []
---

# bert-base-chinese-finetuned-squadv2

This model is a fine-tuned version of [bert-base-chinese](https://huggingface.co/bert-base-chinese) on the [Chinese SQuAD v2.0 dataset](https://huggingface.co/datasets/real-jiakai/chinese-squadv2).

## Model Description

This model is designed for Chinese extractive question answering, where the answer must be a span taken from a given context paragraph. Following the SQuAD v2.0 format, it handles both answerable and unanswerable questions.

Key features:

- Based on the BERT-base Chinese architecture
- Supports both answerable and unanswerable questions
- Trained on Chinese question-answer pairs
- Optimized for extractive question answering

## Intended Uses & Limitations

### Intended Uses

- Chinese extractive question answering
- Reading comprehension tasks
- Information extraction from Chinese text
- Automated question answering systems

### Limitations

- Performance is much better on unanswerable questions (76.65% Exact Match) than on answerable questions (36.41% Exact Match)
- Limited to extractive QA (cannot generate new answers)
- May not perform well on domain-specific questions outside the training data
- Designed for modern Chinese text; may not work well with classical Chinese or dialectal variations

## Training and Evaluation Data

The model was trained on the Chinese SQuAD v2.0 dataset, which contains:

Training Set:

- Total examples: 90,027
- Answerable questions: 46,529
- Unanswerable questions: 43,498

Validation Set:

- Total examples: 9,936
- Answerable questions: 3,991
- Unanswerable questions: 5,945

## Training Procedure

### Training Hyperparameters

- Learning rate: 3e-05
- Training batch size: 12
- Evaluation batch size: 8
- Number of epochs: 5
- Optimizer: AdamW (β1=0.9, β2=0.999, ε=1e-08)
- Learning rate scheduler: Linear
- Maximum sequence length: 384
- Document stride: 128
- Training device: CUDA-enabled GPU

### Training Results

Final evaluation metrics:

- Overall Exact Match: 60.49%
- Overall F1 Score: 60.54%
- Answerable Questions:
  - Exact Match: 36.41%
  - F1 Score: 36.53%
- Unanswerable Questions:
  - Exact Match: 76.65%
  - F1 Score: 76.65%

### Framework Versions

- Transformers: 4.47.0.dev0
- PyTorch: 2.5.1+cu124
- Datasets: 3.1.0
- Tokenizers: 0.20.3

## Usage

```python
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

# Load the fine-tuned model and its tokenizer
model_name = "real-jiakai/bert-base-chinese-finetuned-squadv2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
model.eval()

# Prepare the inputs
question = "your_question"
context = "your_context"
inputs = tokenizer(
    question,
    context,
    add_special_tokens=True,
    truncation="only_second",  # truncate the context, never the question
    max_length=384,            # matches the training sequence length
    return_tensors="pt",
)

# Run inference; the model returns an output object, so read the logits by name
with torch.no_grad():
    outputs = model(**inputs)
start_index = int(torch.argmax(outputs.start_logits))
end_index = int(torch.argmax(outputs.end_logits))

# Heuristic (an assumption, not a tuned threshold): SQuAD v2-style training
# usually labels unanswerable questions with the [CLS] position (index 0),
# so a span starting there, or an inverted span, is treated as "no answer".
if start_index == 0 or end_index < start_index:
    answer = ""
else:
    answer = tokenizer.convert_tokens_to_string(
        tokenizer.convert_ids_to_tokens(inputs["input_ids"][0][start_index:end_index + 1])
    )
```

## Limitations and Bias

The model performs much better on unanswerable questions than on answerable ones (76.65% vs. 36.41% Exact Match), which might indicate:

1. Dataset quality issues
2. Potential translation artifacts in the Chinese version of SQuAD
3. Imbalanced handling of answerable vs. unanswerable questions

## Ethics & Responsible AI

Users should be aware that:

- The model may reflect biases present in the training data
- Performance varies significantly based on question type
- Results should be validated for critical applications
- The model should not be used as the sole decision-maker in critical systems
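
## Cross-Checking the Reported Metrics

The overall Exact Match under Training Results should equal the average of the per-group scores, weighted by how many answerable and unanswerable questions the validation set contains. The sketch below reproduces that arithmetic from the numbers reported in this card. The variable names and the `weighted_average` helper are illustrative and are not part of the original training or evaluation code.

```python
# Validation-set counts and per-group Exact Match scores, copied from this card.
answerable_count = 3991
unanswerable_count = 5945
answerable_em = 36.41    # percent
unanswerable_em = 76.65  # percent


def weighted_average(scores, weights):
    """Return the mean of `scores`, weighted by `weights`."""
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


overall_em = weighted_average(
    [answerable_em, unanswerable_em],
    [answerable_count, unanswerable_count],
)
print(f"Reconstructed overall Exact Match: {overall_em:.2f}%")
# prints "Reconstructed overall Exact Match: 60.49%"
```

The result matches the reported overall Exact Match of 60.49%. Because unanswerable questions make up about 60% of the validation set (5,945 of 9,936), the overall score sits closer to the unanswerable-question score and can mask the much weaker performance on answerable questions.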