---
tags:
- autotrain
- text-generation
- transformers
- named entity recognition
widget:
- text: 'I love AutoTrain because '
license: mit
datasets:
- conll2012_ontonotesv5
language:
- en
---

# Phi-2 model fine-tuned for named entity recognition

The model was fine-tuned on one quarter of the CoNLL-2012 OntoNotes v5 dataset.

- Dataset Source: [conll2012_ontonotesv5](https://huggingface.co/datasets/conll2012_ontonotesv5)
- Subset Used: English_v12
- Number of Examples: 87,265

The prompts and expected outputs were constructed as described in [1]: each entity of the target category is wrapped in `@@` and `##` markers.

Example input:

```md
Instruct: I am an excellent linguist. The task is to label organization entities in the given sentence. Below are some examples

Input: A spokesman for B. A. T said of the amended filings that,`` It would appear that nothing substantive has changed.
Output: A spokesman for @@B. A. T## said of the amended filings that,`` It would appear that nothing substantive has changed.

Input: Since NBC's interest in the Qintex bid for MGM / UA was disclosed, Mr. Wright has n't been available for comment.
Output: Since @@NBC##'s interest in the @@Qintex## bid for @@MGM / UA## was disclosed, Mr. Wright has n't been available for comment.

Input: You know news organizations demand total transparency whether you're General Motors or United States government /.
Output: You know news organizations demand total transparency whether you're @@General Motors## or United States government /.

Input: We respectfully invite you to watch a special edition of Across China.
Output:
```

Expected output:

```md
We respectfully invite you to watch a special edition of @@Across China##.
```

The model is trained to recognize the following named entity categories:

- person
- nationalities or religious or political groups
- facility
- organization
- geopolitical entity
- location
- product
- date
- time expression
- percentage
- monetary value
- quantity
- event
- work of art
- law/legal reference
- language name

# Model Trained Using AutoTrain

This model was trained with the AutoTrain **SFT** trainer. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).

Hyperparameters:

```json
{
  "model": "microsoft/phi-2",
  "valid_split": null,
  "add_eos_token": false,
  "block_size": 1024,
  "model_max_length": 1024,
  "padding": "right",
  "trainer": "sft",
  "use_flash_attention_2": false,
  "disable_gradient_checkpointing": false,
  "evaluation_strategy": "epoch",
  "save_total_limit": 1,
  "save_strategy": "epoch",
  "auto_find_batch_size": false,
  "mixed_precision": "bf16",
  "lr": 0.0002,
  "epochs": 1,
  "batch_size": 1,
  "warmup_ratio": 0.1,
  "gradient_accumulation": 4,
  "optimizer": "adamw_torch",
  "scheduler": "linear",
  "weight_decay": 0.01,
  "max_grad_norm": 1.0,
  "seed": 42,
  "apply_chat_template": false,
  "quantization": "int4",
  "target_modules": null,
  "merge_adapter": false,
  "peft": true,
  "lora_r": 16,
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "dpo_beta": 0.1
}
```

# Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the fine-tuned model and tokenizer from the Hugging Face Hub.
model_path = "pahautelman/phi2-ner-v1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).eval()

prompt = 'Label the person entities in the given sentence: Russian President Vladimir Putin is due to arrive in Havana a few hours from now to become the first post-Soviet leader to visit Cuba.'
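# The single-instruction prompt above is the simplest way to query the model,
# but the fine-tuning data used the few-shot GPT-NER format shown earlier in
# this card, with target entities wrapped in @@...##. The helper below is a
# hypothetical sketch of assembling such a prompt; it is not part of the
# model's training code or of AutoTrain.
def build_ner_prompt(entity_type, examples, query):
    """Build a GPT-NER-style prompt from (sentence, labeled_sentence) pairs."""
    lines = [
        f"Instruct: I am an excellent linguist. The task is to label "
        f"{entity_type} entities in the given sentence. Below are some examples"
    ]
    for sentence, labeled in examples:
        lines.append(f"Input: {sentence}")
        lines.append(f"Output: {labeled}")
    # Leave the final Output empty; the model completes it with the labels.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

# Example mirroring the prompt-format section of this card:
few_shot_prompt = build_ner_prompt(
    entity_type="organization",
    examples=[(
        "Since NBC's interest in the Qintex bid for MGM / UA was disclosed, "
        "Mr. Wright has n't been available for comment.",
        "Since @@NBC##'s interest in the @@Qintex## bid for @@MGM / UA## was "
        "disclosed, Mr. Wright has n't been available for comment.",
    )],
    query="We respectfully invite you to watch a special edition of Across China.",
)
# The remainder of this example keeps the plain `prompt` defined above.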
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors='pt')
outputs = model.generate(
    inputs.to(model.device),
    max_new_tokens=9,   # small budget; increase for sentences with more entities
    do_sample=False,    # greedy decoding for deterministic labels
)
output = tokenizer.batch_decode(outputs)[0]
# Model response: "Output: Russian President, Vladimir Putin"
print(output)
```

# References

[1] Wang et al., "GPT-NER: Named Entity Recognition via Large Language Models", 2023.
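# Post-processing

When prompted in the few-shot format described above, the model returns the labeled sentence as free text rather than a structured list, so downstream code has to pull the entity spans out of the `@@...##` markers. The `extract_entities` helper below is a hypothetical post-processing sketch, not part of the model or of AutoTrain:

```python
import re

def extract_entities(labeled_text):
    """Return the spans the model wrapped in @@...## markers."""
    return re.findall(r"@@(.+?)##", labeled_text)

# On the expected output shown earlier in this card:
print(extract_entities(
    "We respectfully invite you to watch a special edition of @@Across China##."
))  # ['Across China']
```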