
Phi-2 model fine-tuned for the named entity recognition (NER) task

The model was fine-tuned using one quarter of the CoNLL-2012 OntoNotes v5 dataset.
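For reference, the English portion of this dataset is available on the Hugging Face Hub. A minimal loading sketch (the dataset ID, configuration, and the way the 25% subset is drawn below are assumptions, not the exact preprocessing used for this model):

from datasets import load_dataset

# Assumed dataset ID and configuration; the card only states that one quarter of the data was used.
dataset = load_dataset("conll2012_ontonotesv5", "english_v4", split="train")

# Illustrative 25% subsample.
quarter = dataset.shuffle(seed=42).select(range(len(dataset) // 4))
print(quarter)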

The prompts and expected outputs were constructed as described in [1].

Example input:

Instruct: I am an excellent linguist. The task is to label organization entities in the given sentence. Below are some examples

Input: A spokesman for B. A. T said of the amended filings that,`` It would appear that nothing substantive has changed.
Output: A spokesman for @@B. A. T## said of the amended filings that,`` It would appear that nothing substantive has changed.

Input: Since NBC's interest in the Qintex bid for MGM / UA was disclosed, Mr. Wright has n't been available for comment.
Output: Since @@NBC##'s interest in the @@Qintex## bid for @@MGM / UA## was disclosed, Mr. Wright has n't been available for comment.

Input: You know news organizations demand total transparency whether you're General Motors or United States government /.
Output: You know news organizations demand total transparency whether you're @@General Motors## or United States government /.

Input: We respectfully invite you to watch a special edition of Across China.
Output:

Expected output:

We respectfully invite you to watch a special edition of @@Across China##.
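The few-shot prompts above follow the GPT-NER format from [1]. A minimal sketch of how such a prompt can be assembled (the helper functions below are illustrative, not the exact script used to build the training data):

def mark_entities(sentence: str, entities: list[str]) -> str:
    """Wrap each entity mention in @@...## markers, as in the examples above."""
    for entity in entities:
        sentence = sentence.replace(entity, f"@@{entity}##")
    return sentence

def build_prompt(entity_type: str, examples: list[tuple[str, list[str]]], query: str) -> str:
    """Assemble an Instruct/Input/Output few-shot prompt in the style shown above."""
    lines = [
        f"Instruct: I am an excellent linguist. The task is to label "
        f"{entity_type} entities in the given sentence. Below are some examples",
        "",
    ]
    for sentence, entities in examples:
        lines.append(f"Input: {sentence}")
        lines.append(f"Output: {mark_entities(sentence, entities)}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_prompt(
    "organization",
    [("Since NBC's interest in the Qintex bid for MGM / UA was disclosed, Mr. Wright has n't been available for comment.",
      ["NBC", "Qintex", "MGM / UA"])],
    "We respectfully invite you to watch a special edition of Across China.",
)
print(prompt)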

This model is trained to recognize the following named entity categories:

  • person
  • nationalities or religious or political groups
  • facility
  • organization
  • geopolitical entity
  • location
  • product
  • date
  • time expression
  • percentage
  • monetary value
  • quantity
  • event
  • work of art
  • law/legal reference
  • language name
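Because the model marks entities inline with @@ and ##, labeled mentions can be recovered from generated text with a small helper. A minimal sketch (regex-based, assuming the markers in the output are well formed):

import re

def extract_entities(marked_text: str) -> list[str]:
    """Return the entity mentions wrapped in @@...## markers."""
    return re.findall(r"@@(.+?)##", marked_text)

marked = "Since @@NBC##'s interest in the @@Qintex## bid for @@MGM / UA## was disclosed, Mr. Wright has n't been available for comment."
print(extract_entities(marked))  # ['NBC', 'Qintex', 'MGM / UA']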

Model Trained Using AutoTrain

This model was trained using the AutoTrain SFT trainer. For more information, please visit AutoTrain.

Hyperparameters:

{
    "model": "microsoft/phi-2",
    "valid_split": null,
    "add_eos_token": false,
    "block_size": 1024,
    "model_max_length": 1024,
    "padding": "right",
    "trainer": "sft",
    "use_flash_attention_2": false,
    "disable_gradient_checkpointing": false,
    "evaluation_strategy": "epoch",
    "save_total_limit": 1,
    "save_strategy": "epoch",
    "auto_find_batch_size": false,
    "mixed_precision": "bf16",
    "lr": 0.0002,
    "epochs": 1,
    "batch_size": 1,
    "warmup_ratio": 0.1,
    "gradient_accumulation": 4,
    "optimizer": "adamw_torch",
    "scheduler": "linear",
    "weight_decay": 0.01,
    "max_grad_norm": 1.0,
    "seed": 42,
    "apply_chat_template": false,
    "quantization": "int4",
    "target_modules": null,
    "merge_adapter": false,
    "peft": true,
    "lora_r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "dpo_beta": 0.1,
}
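For reference, the LoRA and quantization settings above correspond roughly to the following peft / transformers configuration objects. This is only a hedged sketch of how such settings are usually expressed, not the code AutoTrain runs internally:

import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# "quantization": "int4" -> load the base model in 4-bit
# (compute dtype assumed to follow "mixed_precision": "bf16").
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# "lora_r": 16, "lora_alpha": 32, "lora_dropout": 0.05, "target_modules": null
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)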

Usage


from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "pahautelman/phi2-ner-v1"

# Load the fine-tuned tokenizer and model from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path
).eval()

prompt = 'Label the person entities in the given sentence: Russian President Vladimir Putin is due to arrive in Havana a few hours from now to become the first post-Soviet leader to visit Cuba.'

# Tokenize the prompt and generate greedily; max_new_tokens caps the answer length.
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors='pt')
outputs = model.generate(
    inputs.to(model.device),
    max_new_tokens=9,
    do_sample=False,
)
output = tokenizer.batch_decode(outputs)[0]

# Model response: "Output: Russian President, Vladimir Putin"
print(output)
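Note that batch_decode returns the prompt together with the generated text. If only the model's answer is needed, the prompt tokens can be sliced off before decoding (a small follow-up to the snippet above):

# Decode only the tokens generated after the prompt.
answer = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(answer)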

References:

[1] Wang et al., "GPT-NER: Named Entity Recognition via Large Language Models," 2023.
