masanorihirano committed on
Commit
aaa6794
1 Parent(s): 5c08577

Update README.md

  1. README.md +114 -5
README.md CHANGED
@@ -1,5 +1,114 @@
- ---
- license: other
- license_name: tongyi-qianwen-license
- license_link: LICENSE
- ---
+ ---
+ license: other
+ license_name: tongyi-qianwen-license
+ license_link: LICENSE
+ language:
+ - en
+ - ja
+ library_name: transformers
+ pipeline_tag: text-generation
+ ---
+
+ # nekomata-7b-pfn-qfin
+
+ ## Model Description
+ nekomata-7b-pfn-qfin is a fine-tuned model based on [rinna/nekomata-7b](https://huggingface.co/rinna/nekomata-7b/tree/main).
+ It is a base model that is good at generating continuations of finance-related text.
+ nekomata-7b-pfn-qfin is fine-tuned on 370M tokens from multiple specialized datasets created by Preferred Networks, which are cleared for commercial use.
+ Fine-tuning was carried out with a context length of 2048.
+ This model is released under the [Tongyi Qianwen LICENSE AGREEMENT](https://github.com/QwenLM/Qwen/blob/e8e15962d897714944773cca57fa2e460a3655e8/Tongyi%20Qianwen%20LICENSE%20AGREEMENT).
+
+ The research article is available on [arXiv](https://arxiv.org/abs/2404.10555).
+
+ ## Benchmarking
+ The benchmark scores are obtained using the [Japanese Language Model Financial Evaluation Harness](https://github.com/pfnet-research/japanese-lm-fin-harness).
+ The benchmark uses zero-shot evaluation with the default prompts.
+ ```
+ | Task             | Metric | nekomata-7b     | Ours            |
+ |------------------|--------|-----------------|-----------------|
+ | chabsa           | f1     | 0.8134          | 0.8127          |
+ | cma_basics       | acc    | 0.3158 ± 0.0764 | 0.3684 ± 0.0793 |
+ | cpa_audit        | acc    | 0.2085 ± 0.0203 | 0.1809 ± 0.0193 |
+ | fp2              | acc    | 0.2484 ± 0.0198 | 0.2674 ± 0.0203 |
+ | security_sales_1 | acc    | 0.4912 ± 0.0668 | 0.5088 ± 0.0668 |
+ |------------------|--------|-----------------|-----------------|
+ | Overall          |        | 0.4155          | 0.4276          |
+ ```
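
The overall row appears to be the unweighted mean of the five task scores. The sketch below recomputes it from the table values as a sanity check; the dictionary layout and the `macro_average` helper are illustrative assumptions and are not part of the evaluation harness.

```python
# Scores copied from the benchmark table above (zero-shot, default prompts),
# in the order: chabsa, cma_basics, cpa_audit, fp2, security_sales_1.
scores = {
    "nekomata-7b": [0.8134, 0.3158, 0.2085, 0.2484, 0.4912],
    "nekomata-7b-pfn-qfin": [0.8127, 0.3684, 0.1809, 0.2674, 0.5088],
}


def macro_average(values):
    """Unweighted mean of per-task scores, rounded to four decimals."""
    return round(sum(values) / len(values), 4)


overall = {name: macro_average(vals) for name, vals in scores.items()}
for name, value in overall.items():
    print(f"{name}: {value:.4f}")
```

Both recomputed means agree with the reported overall scores to four decimal places, which supports reading that row as a simple macro-average.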
+ ## Usage
+ Install the required libraries as follows:
+ ```sh
+ python -m pip install numpy sentencepiece torch transformers accelerate transformers_stream_generator tiktoken einops
+ ```
+
+ Execute the following Python code:
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("pfnet/nekomata-7b-pfn-qfin", trust_remote_code=True)
+
+ # Use GPU with bf16 (recommended for supported devices)
+ # model = AutoModelForCausalLM.from_pretrained("pfnet/nekomata-7b-pfn-qfin", device_map="auto", trust_remote_code=True, bf16=True)
+
+ # Use GPU with fp16
+ # model = AutoModelForCausalLM.from_pretrained("pfnet/nekomata-7b-pfn-qfin", device_map="auto", trust_remote_code=True, fp16=True)
+
+ # Use GPU with fp32
+ # model = AutoModelForCausalLM.from_pretrained("pfnet/nekomata-7b-pfn-qfin", device_map="auto", trust_remote_code=True, fp32=True)
+
+ # Use CPU
+ # model = AutoModelForCausalLM.from_pretrained("pfnet/nekomata-7b-pfn-qfin", device_map="cpu", trust_remote_code=True)
+
+ # Automatically select device and precision
+ model = AutoModelForCausalLM.from_pretrained("pfnet/nekomata-7b-pfn-qfin", device_map="auto", trust_remote_code=True)
+
+ text = "日本銀行は"  # prompt: "The Bank of Japan ..."
+ input_ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
+ with torch.no_grad():
+     generated_tokens = model.generate(
+         inputs=input_ids,
+         max_new_tokens=32,
+         do_sample=True,
+         temperature=1.0,
+         repetition_penalty=1.1
+     )[0]
+ generated_text = tokenizer.decode(generated_tokens)
+ print(generated_text)
+ # Example output (sampling is enabled, so results vary):
+ # 日本銀行は、2016年9月に「長短金利操作付き量的・質的金融緩和」を導入し、長期国
+ ```
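
The commented-out loading options above can be folded into one small helper. This is only a sketch: `loading_kwargs` is a hypothetical name that does not appear in the README, and the `bf16`/`fp16`/`fp32` flags are simply the ones shown in the snippet above, passed through unchanged on the assumption that the model's remote code accepts them.

```python
def loading_kwargs(use_gpu: bool, precision: str = "auto") -> dict:
    """Build keyword arguments for AutoModelForCausalLM.from_pretrained.

    Hypothetical helper mirroring the five options listed in the snippet above.
    """
    if precision not in {"auto", "bf16", "fp16", "fp32"}:
        raise ValueError(f"unsupported precision: {precision}")
    kwargs = {"trust_remote_code": True}
    if not use_gpu:
        # CPU option from the README: no precision flag is passed.
        kwargs["device_map"] = "cpu"
        return kwargs
    kwargs["device_map"] = "auto"
    if precision != "auto":
        # Assumed to be read by the model's remote code, as in the README.
        kwargs[precision] = True
    return kwargs


print(loading_kwargs(use_gpu=True, precision="bf16"))
```

The returned dictionary could then be unpacked into the call, for example `AutoModelForCausalLM.from_pretrained("pfnet/nekomata-7b-pfn-qfin", **loading_kwargs(True, "bf16"))`.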
+
+ ## Model Details
+ - Model size: 7B
+ - Fine-tuned tokens: 370M tokens (Japanese: 300M tokens, English: 13M tokens, Digits: 14M tokens)
+ - Context length: 2048
+ - Developed by: Preferred Networks, Inc.
+ - Model type: Causal decoder-only
+ - Language(s): Japanese and English
+ - License: [Tongyi Qianwen LICENSE AGREEMENT](https://github.com/QwenLM/Qwen/blob/e8e15962d897714944773cca57fa2e460a3655e8/Tongyi%20Qianwen%20LICENSE%20AGREEMENT)
+
+ ## Bias, Risks, and Limitations
+ nekomata-7b-pfn-qfin is a new technology that carries risks with use.
+ Testing conducted to date has been in English and Japanese, and has not covered, nor could it cover, all scenarios.
+ For these reasons, as with all LLMs, nekomata-7b-pfn-qfin’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased, or otherwise objectionable responses to user prompts.
+ This model is not designed for legal, tax, investment, financial, or other advice.
+ Therefore, before deploying any applications of nekomata-7b-pfn-qfin, developers should perform safety testing and tuning tailored to their specific applications of the model.
+
+ ## How to cite
+ ```
+ @misc{hirano2024,
+   title={Construction of Domain-specified Japanese Large Language Model for Finance through Continual Pre-training},
+   author={Masanori Hirano and Kentaro Imajo},
+   year={2024},
+   eprint={2404.10555},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
108
+ ## Contributors
109
+ Preferred Networks, Inc.
110
+ - Masanori Hirano
111
+ - Kentaro Imajo
112
+
113
+ # License
114
+ [Tongyi Qianwen LICENSE AGREEMENT](https://github.com/QwenLM/Qwen/blob/e8e15962d897714944773cca57fa2e460a3655e8/Tongyi%20Qianwen%20LICENSE%20AGREEMENT)