---
tags:
- autotrain
- text-generation
widget:
- text: 'I love AutoTrain because '
license: mit
datasets:
- conll2012_ontonotesv5
language:
- en
---

# Phi-2 fine-tuned for named entity recognition

The model was fine-tuned using one quarter of the CoNLL-2012 OntoNotes v5 dataset.

- Dataset Source: [conll2012_ontonotesv5](https://huggingface.co/datasets/conll2012_ontonotesv5)
- Subset Used: English_v12
- Number of Examples: 21,817

The prompts and expected outputs were constructed as described in [1] (an illustrative construction sketch is included at the end of this card).

Example input:

```md
I am an excelent linquist. The task is to label location entities in the given sentence. Below are some examples

Input: Only France and Britain backed Fischler's proposal.
Output: Only @@France## and @@Britain## backed Fischler's proposal.

Input: Germany imported 47,000 sheeps from Britain last year, nearly half of total imports.
Output: @@Germany## imported 47,000 sheeps from @@Britain## last year, nearly half of total imports.

Input: It brought in 4275 tonnes of British mutton, some 10% of overall imports.
Output: It brought in 4275 tonnes of British mutton, some 10% of overall imports.

Input: China says Taiwan spoils atmosphere for talks.
Output:
```

Expected output:

```md
@@China## says @@Taiwan## spoils atmosphere for talks.
```

# Model Trained Using AutoTrain

This model was trained with the **DPO** trainer from AutoTrain. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).

Hyperparameters:

```json
{
  "model": "microsoft/phi-2",
  "train_split": "train",
  "valid_split": null,
  "add_eos_token": true,
  "block_size": 1024,
  "model_max_length": 2048,
  "padding": "right",
  "trainer": "dpo",
  "use_flash_attention_2": false,
  "log": "tensorboard",
  "disable_gradient_checkpointing": false,
  "logging_steps": -1,
  "evaluation_strategy": "epoch",
  "save_total_limit": 1,
  "save_strategy": "epoch",
  "auto_find_batch_size": false,
  "mixed_precision": "bf16",
  "lr": 3e-05,
  "epochs": 1,
  "batch_size": 2,
  "warmup_ratio": 0.05,
  "gradient_accumulation": 1,
  "optimizer": "adamw_torch",
  "scheduler": "linear",
  "weight_decay": 0.0,
  "max_grad_norm": 1.0,
  "seed": 42,
  "apply_chat_template": false,
  "quantization": "int4",
  "target_modules": "",
  "merge_adapter": false,
  "peft": true,
  "lora_r": 16,
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "model_ref": null,
  "dpo_beta": 0.1
}
```

# Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "pahautelman/phi2-ner-dpo-v1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path
).eval()

# Instruction-style prompt asking the model to label person entities.
prompt = 'Label the person entities in the given sentence: Russian President Vladimir Putin is due to arrive in Havana a few hours from now to become the first post-Soviet leader to visit Cuba.'
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors='pt')

# Greedy decoding; a small generation budget is enough because only the
# labelled entities are expected in the answer.
outputs = model.generate(
    inputs.to(model.device),
    max_new_tokens=9,
    do_sample=False,
)
output = tokenizer.batch_decode(outputs)[0]

# Model response: "Answer: Russian President, Vladimir Putin"
print(output)
```

# References

[1] Wang et al., "GPT-NER: Named Entity Recognition via Large Language Models", 2023.
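
# Appendix: illustrative prompt construction

The `@@ ... ##` entity markers shown in the examples above follow the GPT-NER scheme of [1]. The snippet below is a minimal, self-contained sketch of how such marked outputs can be produced from a tokenised sentence with BIO entity tags. The helper name `mark_entities`, the tag strings, and the use of the OntoNotes `GPE` type for the location example are illustrative assumptions, not the exact preprocessing used for this model.

```python
def mark_entities(words, bio_tags, entity_type="GPE"):
    """Wrap every span tagged as `entity_type` in @@ ... ## markers."""
    out, span = [], []

    def flush():
        # Close the currently open entity span, if any.
        if span:
            out.append("@@" + " ".join(span) + "##")
            span.clear()

    for word, tag in zip(words, bio_tags):
        if tag == f"B-{entity_type}":
            flush()
            span.append(word)
        elif tag == f"I-{entity_type}" and span:
            span.append(word)
        else:
            flush()
            out.append(word)
    flush()
    return " ".join(out)


# Hypothetical example, using the sentence from the prompt above.
words = ["Only", "France", "and", "Britain", "backed", "Fischler", "'s", "proposal", "."]
tags = ["O", "B-GPE", "O", "B-GPE", "O", "O", "O", "O", "O"]
print(mark_entities(words, tags))
# Only @@France## and @@Britain## backed Fischler 's proposal .
```

Note that the result is whitespace-joined from the original tokens, so punctuation spacing differs slightly from the detokenised examples shown earlier on this card.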