billm-mistral-7b-conll03-ner

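This model is a fine-tuned version of mistralai/Mistral-7B-v0.1. The auto-generated card lists the training dataset as unknown; the model name and the label set used in the inference code suggest CoNLL-2003 NER. It achieves the following results on the evaluation set: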

Loss: 0.2046
Precision: 0.9273
Recall: 0.9393
F1: 0.9333
Accuracy: 0.9864
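
As a quick sanity check, the reported F1 is consistent with the harmonic mean of the reported precision and recall. The snippet below is a minimal sketch that recomputes it from the rounded values above; the variable names are illustrative, and it makes no claim about which evaluation library produced the original numbers.

```python
# Recompute F1 from the rounded precision and recall reported above.
precision = 0.9273
recall = 0.9393

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.4f}")
```

Because the inputs are already rounded to four decimals, agreement to four decimals is the most this check can show.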

Inference

python -m pip install -U billm==0.1.1

from transformers import AutoTokenizer, pipeline
from peft import PeftModel, PeftConfig
from billm import MistralForTokenClassification

# BIO tag set covering four entity types: PER, ORG, LOC, MISC
label2id = {'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4, 'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
id2label = {v: k for k, v in label2id.items()}

model_id = 'WhereIsAI/billm-mistral-7b-conll03-ner'
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the base model named in the adapter config, then attach the PEFT adapter
peft_config = PeftConfig.from_pretrained(model_id)
model = MistralForTokenClassification.from_pretrained(
    peft_config.base_model_name_or_path,
    num_labels=len(label2id), id2label=id2label, label2id=label2id
)
model = PeftModel.from_pretrained(model, model_id)
# merge_and_unload is necessary for inference: it folds the adapter weights into the base model
model = model.merge_and_unload()

# aggregation_strategy="simple" groups sub-word tokens into whole entity spans
token_classifier = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
sentence = "I live in Hong Kong. I am a student at Hong Kong PolyU."
entities = token_classifier(sentence)
print(entities)
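
With aggregation_strategy="simple", the token-classification pipeline returns a list of dicts with the keys entity_group, score, word, start, and end. The following is a minimal sketch of how that list might be grouped by entity type. The helper name group_entities and the min_score parameter are hypothetical, and the sample list is hand-written to illustrate the format; it is not real output from this model.

```python
from collections import defaultdict


def group_entities(sentence, entities, min_score=0.0):
    """Group pipeline results by entity type, keeping spans scored at least min_score.

    The surface text is sliced from the sentence using the character offsets,
    which avoids spacing differences that can appear in the decoded "word" field.
    """
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(sentence[ent["start"]:ent["end"]])
    return dict(grouped)


sentence = "I live in Hong Kong. I am a student at Hong Kong PolyU."

# Hand-written sample in the pipeline's output format (scores are made up).
sample_entities = [
    {"entity_group": "LOC", "score": 0.99, "word": "Hong Kong", "start": 10, "end": 19},
    {"entity_group": "ORG", "score": 0.98, "word": "Hong Kong PolyU", "start": 39, "end": 54},
]

grouped = group_entities(sentence, sample_entities, min_score=0.5)
print(grouped)
```

Raising min_score is one simple way to trade recall for precision when consuming the predictions downstream.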

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
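
To make the linear schedule concrete, here is a minimal sketch of the learning rate over training. It assumes linear decay to zero with no warmup (the card lists no warmup setting) and takes the total step count, 17560, from the last row of the training results table. The function name linear_lr is illustrative and not part of any library.

```python
BASE_LR = 2e-4         # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 1756  # step count at epoch 1.0 in the training results table
TOTAL_STEPS = 17560     # final step in the training results table (10 epochs)


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Learning rate after `step` updates, assuming linear decay to zero and no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


for epoch in (0, 5, 10):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate is halved by the end of epoch 5 and reaches zero at the final step.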

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
0.0499	1.0	1756	0.1085	0.9196	0.9287	0.9241	0.9845
0.0233	2.0	3512	0.0997	0.9249	0.9226	0.9237	0.9845
0.0097	3.0	5268	0.1343	0.9292	0.9386	0.9339	0.9870
0.0036	4.0	7024	0.1651	0.9245	0.9386	0.9315	0.9864
0.0012	5.0	8780	0.1839	0.9257	0.9373	0.9315	0.9863
0.0005	6.0	10536	0.2027	0.9258	0.9386	0.9321	0.9864
0.0002	7.0	12292	0.2022	0.9276	0.9384	0.9330	0.9864
0.0002	8.0	14048	0.2040	0.9274	0.9388	0.9331	0.9864
0.0001	9.0	15804	0.2048	0.9270	0.9393	0.9331	0.9864
0.0001	10.0	17560	0.2046	0.9273	0.9393	0.9333	0.9864
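
The table shows validation loss rising after epoch 2 while F1 stays roughly flat, so the final checkpoint is not necessarily the best one by either measure. The sketch below, with rows copied from the table, finds the epoch with the highest F1 and the epoch with the lowest validation loss. It also derives the range of training-set sizes implied by 1756 steps per epoch at batch size 8, assuming a single device, no gradient accumulation, and a final partial batch that is kept.

```python
# (epoch, validation_loss, f1) copied from the training results table.
rows = [
    (1, 0.1085, 0.9241),
    (2, 0.0997, 0.9237),
    (3, 0.1343, 0.9339),
    (4, 0.1651, 0.9315),
    (5, 0.1839, 0.9315),
    (6, 0.2027, 0.9321),
    (7, 0.2022, 0.9330),
    (8, 0.2040, 0.9331),
    (9, 0.2048, 0.9331),
    (10, 0.2046, 0.9333),
]

best_f1 = max(rows, key=lambda r: r[2])      # row with the highest F1
lowest_loss = min(rows, key=lambda r: r[1])  # row with the lowest validation loss

print(f"highest F1: {best_f1[2]} at epoch {best_f1[0]}")
print(f"lowest validation loss: {lowest_loss[1]} at epoch {lowest_loss[0]}")

# Training-set sizes n for which ceil(n / batch_size) equals the observed steps per epoch.
steps_per_epoch = 1756
train_batch_size = 8
max_examples = steps_per_epoch * train_batch_size
min_examples = max_examples - train_batch_size + 1

print(f"implied training examples: {min_examples} to {max_examples}")
```

The implied range includes 14,041, the number of sentences in the CoNLL-2003 English training split, which is consistent with the dataset suggested by the model name.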

Framework versions

PEFT 0.9.0
Transformers 4.38.2
PyTorch 2.0.1
Datasets 2.16.0
Tokenizers 0.15.0

Citation

@inproceedings{li2024bellm,
    title = "BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings",
    author = "Li, Xianming and Li, Jing",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics",
    year = "2024",
    publisher = "Association for Computational Linguistics"
}

@article{li2023label,
  title={Label supervised llama finetuning},
  author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},
  journal={arXiv preprint arXiv:2310.01208},
  year={2023}
}
