---
language: zh
license: mit
tags:
- text-classification
- bert
- chinese
- vehicle-control
- user-instructions
datasets:
- custom
metrics:
- accuracy
- f1
- precision
- recall
---

# Vehicle User Instructions Classification - BERT (Chinese)

This repository contains a BERT model fine-tuned to classify Chinese-language vehicle user instructions. It was trained on a dataset of user instructions, each labeled with the vehicle control command it expresses.

## Preface

This model was fine-tuned for our team's UOW CSIT998 Professional Capstone Project.

## Dataset

The dataset used for training and evaluation consists of Chinese text instructions, each labeled with a vehicle control command. It is split as follows:

- Training set: 4499 samples
- Validation set: 2249 samples
- Test set: 2250 samples

The instructions cover 27 vehicle control commands, mapped to class indices as follows:

```
{'开车窗': 0, '关左车门': 1, '关右前车窗': 2, '关闭引擎': 3, '关左前车窗': 4, '开右前车窗': 5, '关左后车窗': 6, '开左后车窗': 7, '开后备箱': 8, '关车门': 9, '关车窗': 10, '开左前车窗': 11, '关右后车窗': 12, '开敞篷': 13, '开左侧车窗': 14, '关敞篷': 15, '喇叭': 16, '开右后车窗': 17, '开右车门': 18, '停车点1': 19, '关后备箱': 20, '关右车门': 21, '开左车门': 22, '停车点2': 23, '开车门': 24, '打开引擎': 25, '关左侧车窗': 26}
```
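
Because the mapping above goes from command name to class index, turning a predicted index back into a command needs the inverse lookup. A minimal sketch of that inversion follows; the names `LABEL2ID`, `ID2LABEL`, and `command_for` are illustrative and do not come from the project's code.

```python
# Label-to-index mapping copied from the model card.
LABEL2ID = {'开车窗': 0, '关左车门': 1, '关右前车窗': 2, '关闭引擎': 3,
            '关左前车窗': 4, '开右前车窗': 5, '关左后车窗': 6, '开左后车窗': 7,
            '开后备箱': 8, '关车门': 9, '关车窗': 10, '开左前车窗': 11,
            '关右后车窗': 12, '开敞篷': 13, '开左侧车窗': 14, '关敞篷': 15,
            '喇叭': 16, '开右后车窗': 17, '开右车门': 18, '停车点1': 19,
            '关后备箱': 20, '关右车门': 21, '开左车门': 22, '停车点2': 23,
            '开车门': 24, '打开引擎': 25, '关左侧车窗': 26}

# Invert the mapping so a class index can be looked up directly.
ID2LABEL = {index: label for label, index in LABEL2ID.items()}


def command_for(index: int) -> str:
    """Return the command name for a class index (hypothetical helper)."""
    return ID2LABEL[index]


print(command_for(0))
print(command_for(25))
```
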

## Model

The model is based on the pre-trained Chinese BERT model (`bert-base-chinese`) and was fine-tuned on the vehicle user instructions dataset with the following training arguments:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='',
    do_train=True,
    do_eval=True,
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    warmup_steps=100,
    weight_decay=0.01,
    logging_strategy='steps',
    logging_dir='',
    logging_steps=50,
    evaluation_strategy="steps",  # evaluate every `eval_steps` steps
    eval_steps=50,
    save_strategy="steps",
    save_steps=200,  # a multiple of eval_steps, as load_best_model_at_end requires
    fp16=True,  # mixed-precision training
    load_best_model_at_end=True
)
```
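
The arguments set `warmup_steps=100` but do not override the learning rate or the scheduler, so the `TrainingArguments` defaults (a base learning rate of 5e-5 with a linear schedule) presumably applied. Under that assumption, the sketch below shows how the learning rate would ramp up over the first 100 steps and then decay linearly. The function name `linear_warmup_lr` and the value of `TOTAL_STEPS` are illustrative only; the actual run length is not taken from this card.

```python
def linear_warmup_lr(step: int, total_steps: int,
                     base_lr: float = 5e-5, warmup_steps: int = 100) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Decay from base_lr down to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 1000  # illustrative placeholder, not the actual run length
for step in (0, 50, 100, 550, 1000):
    print(step, linear_warmup_lr(step, TOTAL_STEPS))
```

With warmup, the first updates use a small learning rate, which tends to keep early fine-tuning steps from disturbing the pre-trained weights too abruptly.
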

## Training Results

The model was trained for 3 epochs, and the training progress can be summarized as follows:

| Step | Training Loss | Validation Loss | Accuracy | F1 | Precision | Recall |
|------|---------------|-----------------|----------|----------|-----------|----------|
| 50 | 3.257000 | 2.964479 | 0.168519 | 0.089801 | 0.229036 | 0.126555 |
| 100 | 2.525000 | 1.711695 | 0.648288 | 0.532127 | 0.595545 | 0.590985 |
| 150 | 1.197200 | 0.628560 | 0.921298 | 0.888212 | 0.892879 | 0.890719 |
| ... | ... | ... | ... | ... | ... | ... |
| 8000 | 0.045900 | 0.136842 | 0.969320 | 0.969658 | 0.969638 | 0.970056 |

## Evaluation

The trained model was evaluated on the training, validation, and test sets, achieving the following performance:

| Split | eval_loss | eval_Accuracy | eval_F1 | eval_Precision | eval_Recall |
|-------|-----------|---------------|----------|----------------|-------------|
| train | 0.036020 | 0.991331 | 0.991048 | 0.991615 | 0.990673 |
| val | 0.136842 | 0.969320 | 0.969658 | 0.969638 | 0.970056 |
| test | 0.126695 | 0.974222 | 0.975473 | 0.975814 | 0.975435 |

Accuracy, F1, precision, and recall are all about 0.97 on the validation and test sets, against about 0.99 on the training set. The gap of roughly two percentage points suggests only mild overfitting.
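
The card does not say how the four metrics were computed or averaged across the 27 classes. As an illustration only, the sketch below computes accuracy together with macro-averaged precision, recall, and F1 in plain Python. Macro averaging is an assumption, and the helper name `macro_metrics` and the toy labels are made up for the example.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the model card does not state
    which averaging mode was used.
    """
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "Accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "F1": sum(f1s) / n,          # unweighted mean of per-class F1
        "Precision": sum(precisions) / n,
        "Recall": sum(recalls) / n,
    }


# Toy example with three classes (made-up labels, not model output).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(macro_metrics(y_true, y_pred))
```

Macro averaging weights every class equally, which is one reason F1 does not have to fall between the reported precision and recall: it is the mean of the per-class F1 scores, not the F1 of the averaged precision and recall.
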

## Usage

To run inference with the fine-tuned model, you can call the Hugging Face Inference API. The following Python example sends a request to the API:

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/lindsey-chang/vehicle-user-instructions-classification-bert-chinese"
API_TOKEN = "your-api-token"  # placeholder: replace with your personal API token
headers = {"Authorization": f"Bearer {API_TOKEN}"}


def query(payload):
    """POST the payload to the Inference API and return the decoded JSON."""
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()


# Example usage
input_text = "请打开车窗"  # "Please open the car window"
output = query({"inputs": input_text})
print(output)
```

Set `API_TOKEN` to your personal API token, which you can create in your Hugging Face account settings.

The response contains the model's prediction for the input instruction. If the prediction is reported as a class index, map it back to the corresponding vehicle control command using the class mapping in the Dataset section.
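
Mapping a response back to a command could look like the sketch below. It rests on two assumptions that the card does not confirm: that the API returns a list of `{"label", "score"}` entries (possibly wrapped in one more list), and that labels without configured names appear as `LABEL_<index>`. The helper names, the mapping excerpt, and the sample response are all illustrative.

```python
import re

# Excerpt of the index-to-command mapping from the Dataset section.
ID2LABEL_EXCERPT = {0: "开车窗", 3: "关闭引擎", 25: "打开引擎"}


def top_prediction(output):
    """Return the highest-scoring {"label", "score"} entry.

    Assumes `output` is a list of such dicts, optionally wrapped in one
    more list; this shape is an assumption, not taken from the model card.
    """
    if isinstance(output, dict) and "error" in output:
        raise RuntimeError(output["error"])
    candidates = output[0] if output and isinstance(output[0], list) else output
    return max(candidates, key=lambda entry: entry["score"])


def command_from_label(label, id2label):
    """Map a label such as "LABEL_3" to a command name.

    Labels that do not match the assumed "LABEL_<index>" pattern are
    returned unchanged, since they may already be command names.
    """
    match = re.fullmatch(r"LABEL_(\d+)", label)
    if match is None:
        return label
    return id2label.get(int(match.group(1)), label)


# Made-up response for illustration; real scores will differ.
sample_output = [[{"label": "LABEL_0", "score": 0.98},
                  {"label": "LABEL_3", "score": 0.01}]]
best = top_prediction(sample_output)
print(command_from_label(best["label"], ID2LABEL_EXCERPT))
```
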