Commit c914a4d by ByFinTech · 1 Parent(s): 6d94330

Update README.md

Files changed (1)
  1. README.md +80 -0
README.md CHANGED
@@ -1,3 +1,83 @@
---
license: mit
---

# FinGPT sentiment analysis task

## Model info
- Base model: InternLM-20B
- Training method: Instruction Fine-tuning + LoRA
- Task: Sentiment Analysis

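
For readers new to LoRA, the rough idea is that the base weights stay frozen and only a pair of small low-rank matrices is trained for each adapted layer. The sketch below counts trainable parameters for a single weight matrix. The dimensions and rank are hypothetical illustration values, not the settings used for this model.

```python
from typing import Tuple


def lora_param_counts(d_out: int, d_in: int, rank: int) -> Tuple[int, int]:
    """Return (full_params, lora_params) for one d_out x d_in weight matrix.

    LoRA learns an update B @ A, where B is d_out x rank and A is rank x d_in,
    so only rank * (d_out + d_in) values are trained for that matrix.
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Hypothetical sizes chosen only for illustration.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```
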
## Packages
``` python
# Run in a Colab/Jupyter cell (the leading "!" runs a shell command)
!pip install transformers==4.32.0 peft==0.5.0
!pip install sentencepiece
!pip install accelerate
!pip install torch
!pip install datasets
!pip install bitsandbytes
```

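
Because `transformers` and `peft` are pinned, it can help to confirm the runtime actually has those versions before loading a 20B model. The following is a minimal sketch using only the standard library. The helper names `find_mismatches` and `installed_versions` are hypothetical and not part of FinGPT.

```python
from importlib import metadata

# Pins taken from the install commands in this README.
PINNED = {"transformers": "4.32.0", "peft": "0.5.0"}


def find_mismatches(pinned, installed):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pinned.items()
        if installed.get(name) != wanted
    }


def installed_versions(names):
    """Look up installed versions; packages that are missing map to None."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


mismatches = find_mismatches(PINNED, installed_versions(PINNED))
for name, (wanted, found) in mismatches.items():
    print(f"{name}: expected {wanted}, found {found}")
```
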
## Inference: Try the model in Google Colab
``` python
from transformers import LlamaForCausalLM, LlamaTokenizerFast
from peft import PeftModel  # 0.5.0

# Load the base model in 8-bit and attach the LoRA adapter
base_model = "internlm/internlm-20b"
peft_model = "FinGPT/fingpt-sentiment_internlm-20b_lora"
tokenizer = LlamaTokenizerFast.from_pretrained(base_model, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # use EOS as the padding token so prompts can be batched
model = LlamaForCausalLM.from_pretrained(base_model, trust_remote_code=True, device_map="cuda:0", load_in_8bit=True)
model = PeftModel.from_pretrained(model, peft_model)
model = model.eval()

# Make prompts
prompt = [
'''Instruction: What is the sentiment of this news? Please choose an answer from {negative/neutral/positive}
Input: FINANCING OF ASPOCOMP 'S GROWTH Aspocomp is aggressively pursuing its growth strategy by increasingly focusing on technologically more demanding HDI printed circuit boards PCBs .
Answer: ''',
'''Instruction: What is the sentiment of this news? Please choose an answer from {negative/neutral/positive}
Input: According to Gran , the company has no plans to move all production to Russia , although that is where the company is growing .
Answer: ''',
'''Instruction: What is the sentiment of this news? Please choose an answer from {negative/neutral/positive}
Input: A tinyurl link takes users to a scamming site promising that users can earn thousands of dollars by becoming a Google ( NASDAQ : GOOG ) Cash advertiser .
Answer: ''',
]

# Generate results (truncation=True makes max_length take effect; inputs go to the model's GPU)
tokens = tokenizer(prompt, return_tensors='pt', padding=True, truncation=True, max_length=512).to("cuda:0")
res = model.generate(**tokens, max_length=512)
# Drop padding/EOS tokens so only the label text remains after "Answer: "
res_sentences = [tokenizer.decode(i, skip_special_tokens=True) for i in res]
out_text = [o.split("Answer: ")[1].strip() for o in res_sentences]

# Show results
for sentiment in out_text:
    print(sentiment)

# Output:
# positive
# neutral
# negative
```

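
The prompt layout and the `"Answer: "` split used in the inference example can be wrapped in two small helpers. This is a sketch only: `build_prompt` and `parse_sentiment` are hypothetical names, and mapping the generated text onto the three labels is an assumption about how one might post-process the output, not part of the FinGPT code.

```python
LABELS = ("negative", "neutral", "positive")
ANSWER_MARKER = "Answer: "

# Same layout as the prompts in the inference example.
PROMPT_TEMPLATE = (
    "Instruction: What is the sentiment of this news? "
    "Please choose an answer from {{negative/neutral/positive}}\n"
    "Input: {text}\n"
    "Answer: "
)


def build_prompt(text):
    """Wrap a news sentence in the instruction format shown in the README."""
    return PROMPT_TEMPLATE.format(text=text.strip())


def parse_sentiment(generated):
    """Return the first known label after the last 'Answer: ', or None."""
    if ANSWER_MARKER not in generated:
        return None
    answer = generated.rsplit(ANSWER_MARKER, 1)[1].lower()
    # Pick the label that appears earliest in the generated answer.
    hits = [(answer.find(label), label) for label in LABELS if label in answer]
    return min(hits)[1] if hits else None


demo = build_prompt("Operating profit rose compared with last year .")
print(parse_sentiment(demo + "positive</s>"))
```

Returning `None` for unparseable text keeps a malformed generation from being silently counted as one of the three labels.
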
## Training Script: [Our Code](https://github.com/AI4Finance-Foundation/FinGPT/tree/master/fingpt/FinGPT_Benchmark)
```
#internlm-20b
deepspeed -i "localhost:2" train_lora.py \
--run_name sentiment-internlm-20b-8epochs-lr2e-4-linear \
--base_model internlm-20b \
--dataset data/fingpt-sentiment-train \
--max_length 512 \
--batch_size 8 \
--learning_rate 2e-4 \
--num_epochs 8 \
> train_internlm-20b_1gpu_8epochs_lr2e4_bs8_fp16_linear.log 2>&1
```

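
When sweeping hyperparameters, it may be convenient to assemble the same command from a dictionary instead of editing the shell line by hand. The sketch below only builds and prints the argument list. `build_command` is a hypothetical helper, and the flag names are copied from the command in this README rather than checked against `train_lora.py`.

```python
import shlex

# Hyperparameters copied from the training command in this README.
TRAIN_ARGS = {
    "run_name": "sentiment-internlm-20b-8epochs-lr2e-4-linear",
    "base_model": "internlm-20b",
    "dataset": "data/fingpt-sentiment-train",
    "max_length": 512,
    "batch_size": 8,
    "learning_rate": "2e-4",
    "num_epochs": 8,
}


def build_command(args, include="localhost:2", script="train_lora.py"):
    """Return the deepspeed invocation as an argument list."""
    cmd = ["deepspeed", "-i", include, script]
    for key, value in args.items():
        cmd += [f"--{key}", str(value)]
    return cmd


# Print a shell-safe version of the command for inspection.
print(shlex.join(build_command(TRAIN_ARGS)))
```
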
## Training Data:

* https://huggingface.co/datasets/FinGPT/fingpt-sentiment-train
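
To inspect the training data before fine-tuning, one option is to load it with the `datasets` package and count the labels. This is a sketch under stated assumptions: the `train` split name and the `output` column are assumed rather than confirmed here, and the sample records are hypothetical, included only so the counting helper can run without a download.

```python
from collections import Counter


def load_train_split():
    """Download the training split; needs the `datasets` package and network access."""
    from datasets import load_dataset  # imported lazily so the helper below runs without it

    # The split name "train" is an assumption.
    return load_dataset("FinGPT/fingpt-sentiment-train", split="train")


def label_distribution(records, label_field="output"):
    """Count how often each label appears; `label_field` is an assumed column name."""
    return Counter(str(r[label_field]).strip().lower() for r in records)


# Hypothetical records, used only to show the expected shape.
sample = [
    {"input": "Profit rose 12% year on year.", "output": "positive"},
    {"input": "The company will publish results on Friday.", "output": "neutral"},
    {"input": "Sales fell sharply in the third quarter.", "output": "negative"},
    {"input": "Operating profit improved.", "output": "Positive "},
]
print(label_distribution(sample).most_common())
```

With the real data, `label_distribution(load_train_split())` would give the same kind of summary.
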
- PEFT 0.5.0