dyyyyyyyy committed
Commit 6660f8c • 1 Parent(s): afc4430

Update README.md

Files changed (1): README.md (+5, -4)
README.md CHANGED

@@ -21,7 +21,8 @@ We introduce GNER, a **G**enerative **N**amed **E**ntity **R**ecognition framewo
 * 💻 Code: [https://github.com/yyDing1/GNER/](https://github.com/yyDing1/GNER/)
 * 📖 Paper: [Rethinking Negative Instances for Generative Named Entity Recognition](https://arxiv.org/abs/2402.16602)
 * 💾 Models in the 🤗 HuggingFace Hub: [GNER-Models](https://huggingface.co/collections/dyyyyyyyy/gner-65dda2cb96c6e35c814dea56)
-* 🔍 Reproduction Materials: [Reproduction Materials](https://drive.google.com/drive/folders/1m2FqDgItEbSoeUVo-i18AwMvBcNkZD46?usp=drive_link)
+* 🧪 Reproduction Materials: [Reproduction Materials](https://drive.google.com/drive/folders/1m2FqDgItEbSoeUVo-i18AwMvBcNkZD46?usp=drive_link)
+* 🎨 Example Jupyter Notebooks: [GNER Notebook](https://github.com/yyDing1/GNER/blob/main/notebook.ipynb)
 
 <p align="center">
 <img src="https://github.com/yyDing1/GNER/raw/main/assets/zero_shot_results.png">
@@ -49,9 +50,9 @@ pip install torch>=2.1.0 datasets>=2.17.0 deepspeed>=0.13.4 accelerate>=0.27.2 t
 Below is an example using `GNER-LLaMA`
 ```python
 >>> import torch
->>> from transformers import AutoTokenizer, AutoModelForCasualLM
+>>> from transformers import AutoTokenizer, AutoModelForCausalLM
 >>> tokenizer = AutoTokenizer.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B")
->>> model =AutoModelForCasualLM.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B", torch_dtype=torch.bfloat16).cuda()
+>>> model = AutoModelForCausalLM.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B", torch_dtype=torch.bfloat16).cuda()
 >>> model = model.eval()
 >>> instruction_template = "Please analyze the sentence provided, identifying the type of entity for each word on a token-by-token basis.\nOutput format is: word_1(label_1), word_2(label_2), ...\nWe'll use the BIO-format to label the entities, where:\n1. B- (Begin) indicates the start of a named entity.\n2. I- (Inside) is used for words within a named entity but are not the first word.\n3. O (Outside) denotes words that are not part of a named entity.\n"
 >>> sentence = "did george clooney make a musical in the 1980s"
@@ -61,7 +62,7 @@ Below is an example using `GNER-LLaMA`
 >>> inputs = tokenizer(instruction, return_tensors="pt").to("cuda")
 >>> outputs = model.generate(**inputs, max_new_tokens=640)
 >>> response = tokenizer.decode(outputs[0], skip_special_tokens=True)
->>> response = response[preds.find("[/INST]") + len("[/INST]"):].strip()
+>>> response = response[response.find("[/INST]") + len("[/INST]"):].strip()
 >>> print(response)
 "did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(B-genre) in(O) the(O) 1980s(B-year)"
 ```
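The snippet in the diff leaves `response` as a flat `word(label)` string in BIO format; downstream use typically requires grouping those labels into entity spans. A minimal sketch of that post-processing (the `parse_bio_response` helper is hypothetical, not part of the GNER codebase):

```python
import re

def parse_bio_response(response):
    """Group GNER-style output "word_1(label_1) word_2(label_2) ..."
    into (entity_text, entity_type) spans using BIO rules."""
    # Each token is rendered as word(label); pull out the pairs.
    pairs = re.findall(r"(\S+)\(([^)]+)\)", response)
    entities, current_words, current_type = [], [], None
    for word, label in pairs:
        if label.startswith("B-"):
            # B- starts a new entity, closing any open one first.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [word], label[2:]
        elif label.startswith("I-") and current_words and label[2:] == current_type:
            # I- with a matching type continues the open entity.
            current_words.append(word)
        else:
            # "O" (or a malformed continuation) ends any open entity.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities

response = "did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(B-genre) in(O) the(O) 1980s(B-year)"
print(parse_bio_response(response))
# → [('george clooney', 'actor'), ('musical', 'genre'), ('1980s', 'year')]
```

Treating an `I-` tag with a mismatched type as `O` is one common convention for repairing invalid BIO sequences; a generative model can emit such sequences, so some repair policy is needed.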