dyyyyyyyy committed
Commit 6660f8c • 1 Parent(s): afc4430

Update README.md

Files changed (1): README.md (+5, -4)
README.md CHANGED

@@ -21,7 +21,8 @@ We introduce GNER, a **G**enerative **N**amed **E**ntity **R**ecognition framewo
 * 💻 Code: [https://github.com/yyDing1/GNER/](https://github.com/yyDing1/GNER/)
 * 📖 Paper: [Rethinking Negative Instances for Generative Named Entity Recognition](https://arxiv.org/abs/2402.16602)
 * 💾 Models in the 🤗 HuggingFace Hub: [GNER-Models](https://huggingface.co/collections/dyyyyyyyy/gner-65dda2cb96c6e35c814dea56)
-* 🔍 Reproduction Materials: [Reproduction Materials](https://drive.google.com/drive/folders/1m2FqDgItEbSoeUVo-i18AwMvBcNkZD46?usp=drive_link)
+* 🧪 Reproduction Materials: [Reproduction Materials](https://drive.google.com/drive/folders/1m2FqDgItEbSoeUVo-i18AwMvBcNkZD46?usp=drive_link)
+* 🎨 Example Jupyter Notebooks: [GNER Notebook](https://github.com/yyDing1/GNER/blob/main/notebook.ipynb)
 
 <p align="center">
 <img src="https://github.com/yyDing1/GNER/raw/main/assets/zero_shot_results.png">
@@ -49,9 +50,9 @@ pip install torch>=2.1.0 datasets>=2.17.0 deepspeed>=0.13.4 accelerate>=0.27.2 t
 Below is an example using `GNER-LLaMA`
 ```python
 >>> import torch
->>> from transformers import AutoTokenizer, AutoModelForCasualLM
+>>> from transformers import AutoTokenizer, AutoModelForCausalLM
 >>> tokenizer = AutoTokenizer.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B")
->>> model =AutoModelForCasualLM.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B", torch_dtype=torch.bfloat16).cuda()
+>>> model = AutoModelForCausalLM.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B", torch_dtype=torch.bfloat16).cuda()
 >>> model = model.eval()
 >>> instruction_template = "Please analyze the sentence provided, identifying the type of entity for each word on a token-by-token basis.\nOutput format is: word_1(label_1), word_2(label_2), ...\nWe'll use the BIO-format to label the entities, where:\n1. B- (Begin) indicates the start of a named entity.\n2. I- (Inside) is used for words within a named entity but are not the first word.\n3. O (Outside) denotes words that are not part of a named entity.\n"
 >>> sentence = "did george clooney make a musical in the 1980s"
@@ -61,7 +62,7 @@ Below is an example using `GNER-LLaMA`
 >>> inputs = tokenizer(instruction, return_tensors="pt").to("cuda")
 >>> outputs = model.generate(**inputs, max_new_tokens=640)
 >>> response = tokenizer.decode(outputs[0], skip_special_tokens=True)
->>> response = response[preds.find("[/INST]") + len("[/INST]"):].strip()
+>>> response = response[response.find("[/INST]") + len("[/INST]"):].strip()
 >>> print(response)
 "did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(B-genre) in(O) the(O) 1980s(B-year)"
 ```
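The snippet in the diff leaves `response` as a flat `word(label)` string in BIO format; downstream use typically requires grouping those labels into entity spans. A minimal sketch of that post-processing (the `parse_bio_response` helper is hypothetical, not part of the GNER codebase):

```python
import re

def parse_bio_response(response):
    """Group GNER-style output "word_1(label_1) word_2(label_2) ..."
    into (entity_text, entity_type) spans using BIO rules."""
    # Each token is rendered as word(label); pull out the pairs.
    pairs = re.findall(r"(\S+)\(([^)]+)\)", response)
    entities, current_words, current_type = [], [], None
    for word, label in pairs:
        if label.startswith("B-"):
            # B- starts a new entity, closing any open one first.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [word], label[2:]
        elif label.startswith("I-") and current_words and label[2:] == current_type:
            # I- with a matching type continues the open entity.
            current_words.append(word)
        else:
            # "O" (or a malformed continuation) ends any open entity.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities

response = "did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(B-genre) in(O) the(O) 1980s(B-year)"
print(parse_bio_response(response))
# → [('george clooney', 'actor'), ('musical', 'genre'), ('1980s', 'year')]
```

Treating an `I-` tag with a mismatched type as `O` is one common convention for repairing invalid BIO sequences; a generative model can emit such sequences, so some repair policy is needed.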