---
license: cc-by-nc-4.0
language:
- en
- ja
- zh
metrics:
- f1
tags:
- Information Extraction
- NER
---
This is the OIELLM model released with the paper linked below. It is based on Meta's LLaMA3-8B-Instruct: https://huggingface.co/meta-llama/Meta-Llama-3-8B


***Please use the following format when prompting OIELLM*** to extract information from an input text or sentence.


OIELLM supports three languages (English, Chinese, and Japanese). You must append a task instruction word to the input to define the kind of task.



![image/png](https://cdn-uploads.huggingface.co/production/uploads/629311a945f405d06678224b/9zG-y_-YbPx-z2woGCx-8.png)

The input and output format is as follows:
    {
        "input": "In 1953, filming of \"On the Waterfront\" starring Marlon Brando began, and Kazan struggled with Spiegel's persistent budget cuts and managed to complete the film, which was released the following year in 1954 and became a huge hit with support from the laborer class./NER",
        "output": "Literature/NER/:Person;Marlon Brando:Product Name;On the Waterfront:Person;Kazan:Person;Spiegel"
    }
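
Below is a minimal parsing sketch for this output string. The delimiter conventions (a `domain/task/` header followed by colon-separated `label;span` pairs) are inferred from the single example above, so the helper `parse_oiellm_output` should be treated as illustrative rather than an official utility.

```python
# Minimal parsing sketch, assuming the output format shown above:
#   "<domain>/<task>/:<label>;<span>:<label>;<span>..."
# Inferred from the single example; not an official script.
def parse_oiellm_output(output: str):
    """Return (domain, task, [(label, span), ...]) from an OIELLM output string."""
    header, _, body = output.partition("/:")   # "Literature/NER" | "Person;Marlon Brando:..."
    domain, _, task = header.partition("/")    # "Literature", "NER"
    pairs = []
    for item in body.split(":"):               # each item looks like "Person;Marlon Brando"
        label, _, span = item.partition(";")
        if span:
            pairs.append((label, span))
    return domain, task, pairs


example = ("Literature/NER/:Person;Marlon Brando:Product Name;On the Waterfront"
           ":Person;Kazan:Person;Spiegel")
print(parse_oiellm_output(example))
# ('Literature', 'NER', [('Person', 'Marlon Brando'), ('Product Name', 'On the Waterfront'),
#                        ('Person', 'Kazan'), ('Person', 'Spiegel')])
```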


The model is loaded with the standard **AutoTokenizer and AutoModelForCausalLM** classes via `from_pretrained`.
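
As a minimal usage sketch (assumptions: `MODEL_ID` below is a placeholder for this model repository's Hub id, and the dtype, device, and generation settings are illustrative, not prescribed by the paper), loading and prompting could look like this:

```python
# Minimal inference sketch. Replace MODEL_ID with this repository's id on the
# Hugging Face Hub (the id below is a placeholder), and adjust dtype/device
# and generation settings to your environment.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-namespace/OIELLM"  # placeholder; use this model repo's id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Append the task instruction word (here "/NER") to the raw sentence,
# exactly as in the input/output example above.
text = (
    'In 1953, filming of "On the Waterfront" starring Marlon Brando began, '
    "and Kazan struggled with Spiegel's persistent budget cuts and managed to "
    "complete the film, which was released the following year in 1954 and "
    "became a huge hit with support from the laborer class./NER"
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Strip the prompt tokens and keep only the newly generated extraction string.
generated = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```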

**If you have any questions, you can leave a message in this community, or contact me directly via the e-mail address in the paper.**








**Finally, I would like to thank the contributors of the datasets that form the foundation of the MMM dataset, and the pioneering researchers who selflessly shared their work.**

**1. Japanese Wikipedia NER dataset    Takahiro Omi  https://github.com/stockmarkteam/ner-wikipedia-dataset**

**2. JGLUE: Japanese General Language Understanding Evaluation    Kentaro Kurihara, Daisuke Kawahara, Tomohide Shibata   https://github.com/yahoojapan/JGLUE?tab=readme-ov-file**

**3. livedoor news corpus   関口宏司  https://www.rondhuit.com/download.html**

**4. UniversalNER    Wenxuan Zhou   https://arxiv.org/abs/2308.03279**



Paper and citation information: https://arxiv.org/abs/2407.10953

    @misc{gan2024mmmmultilingualmutualreinforcement,
          title={MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models},
          author={Chengguang Gan and Qingyu Yin and Xinyang He and Hanjun Wei and Yunhao Liang and Younghun Lim and Shijian Wang and Hexiang Huang and Qinghao Zhang and Shiwen Ni and Tatsunori Mori},
          year={2024},
          eprint={2407.10953},
          archivePrefix={arXiv},
          primaryClass={cs.CL},
          url={https://arxiv.org/abs/2407.10953}
    }