PEFT
dylanalloy committed
Commit 289337b
1 Parent(s): 41e6eee

πŸ“ feat(docs): README

Files changed (1)
  1. README.md +132 -3
README.md CHANGED

---
library_name: peft
license: apache-2.0
datasets:
- dylanalloy/ehc-contrived-financial
language:
- en
---
# Everything Has Context

### falcon-ehc-contrived-financial-7b

## 🤷 Purpose

A finetuned adapter (base model: [falcon-7b-instruct](https://huggingface.co/tiiuae/falcon-7b-instruct)) engineered for Q/A with context retrieval, trained on the [dylanalloy/ehc-contrived-financial](https://huggingface.co/datasets/dylanalloy/ehc-contrived-financial) dataset. Read more on the dataset page to understand how this repo could be used in future "chain-of-thought" research, and what this model cannot reasonably achieve given the contrived nature of the dataset's context.

## 🧶 Explain

The `falcon-7b-instruct` base model performs well for the compute it requires, QLoRA is an effective finetuning method, `bitsandbytes` makes quantization easy, and `PEFT` makes the QLoRA training setup easy.

The whole thing can be trained in about 2 hours on a single NVIDIA card (2020 or newer) with 12GB of VRAM.
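
To make that concrete, here is a minimal sketch of the QLoRA setup being described. The quantization values mirror the config listed in the generated modelcard at the bottom of this README (with `load_in_4bit=True` assumed), while the LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`, dropout) are illustrative assumptions, not the exact values used to produce this adapter.

``` python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

BASE_MODEL = "tiiuae/falcon-7b-instruct"

# 4-bit quantization keeps the frozen base weights small enough for ~12GB of VRAM
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True
    , bnb_4bit_quant_type="fp4"
    , bnb_4bit_use_double_quant=True
    , bnb_4bit_compute_dtype=torch.bfloat16
)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL
    , quantization_config=bnb_config
    , device_map="auto"
    , trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token

# only small low-rank adapter matrices are trained; the quantized base stays frozen
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16                                  # illustrative values, not the exact ones used here
    , lora_alpha=32
    , target_modules=["query_key_value"]  # the fused attention projection in falcon
    , lora_dropout=0.05
    , bias="none"
    , task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)
```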

## 🏋️ Training

Finetuning with QLoRA is heavily documented elsewhere; the training here was done with the following parameters:

``` python
from transformers import TrainingArguments

# OUTPUT_DIR is wherever you want checkpoints written
training_args = TrainingArguments(
    per_device_train_batch_size=1
    , gradient_accumulation_steps=4
    , num_train_epochs=20
    , learning_rate=2e-4
    , fp16=True
    , save_total_limit=3
    , logging_steps=13
    , output_dir=OUTPUT_DIR
    , max_steps=500
    , optim="paged_adamw_8bit"
    , lr_scheduler_type="cosine"
    , warmup_ratio=0.05
)
```
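
For orientation, these arguments would typically be handed to a standard `transformers.Trainer` along with the tokenized dataset. The sketch below is an assumption about that wiring, not the exact training script used for this adapter: the prompt layout and dataset column names are guesses, and `model` and `tokenizer` refer to a QLoRA setup like the one sketched in the Explain section.

``` python
from datasets import load_dataset
from transformers import Trainer, DataCollatorForLanguageModeling

# column names and prompt layout are assumptions inferred from the usage example below
def to_prompt(row):
    return f"QUESTION: {row['question']}\nCONTEXT:\n{row['context']}\nFOLLOWUP: {row['followup']}"

data = load_dataset("dylanalloy/ehc-contrived-financial", split="train")
data = data.map(lambda row: tokenizer(to_prompt(row), truncation=True, max_length=1024))

trainer = Trainer(
    model=model                      # the PEFT-wrapped, 4-bit quantized falcon-7b-instruct
    , args=training_args
    , train_dataset=data
    , data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)
)
model.config.use_cache = False       # avoids cache warnings while training
trainer.train()
```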

## ⌨️ Usage

`PEFT` is important in this implementation. So is `bitsandbytes`. If you do not know how to use them, their documentation is excellent.

``` python
from peft import (
    LoraConfig
    , PeftConfig
    , PeftModel
)
from transformers import (
    AutoModelForCausalLM
    , AutoTokenizer
    , BitsAndBytesConfig
)
import torch

PEFT_MODEL = "dylanalloy/ehc-contrived-financial"

config = PeftConfig.from_pretrained(PEFT_MODEL)

bb_config = BitsAndBytesConfig(
    load_in_4bit=True
    , bnb_4bit_use_double_quant=True
    , bnb_4bit_quant_type="nf4"
    , bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path
    , return_dict=True
    , quantization_config=bb_config
    , device_map="auto"
    , trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token

# attach the finetuned adapter weights on top of the quantized base model
model = PeftModel.from_pretrained(model, PEFT_MODEL)

generation_config = model.generation_config
generation_config.max_new_tokens = 200
generation_config.temperature = 0.7
generation_config.top_p = 0.7
generation_config.num_return_sequences = 1
generation_config.pad_token_id = tokenizer.eos_token_id
generation_config.eos_token_id = tokenizer.eos_token_id

DEVICE = "cuda:0"

def generate_response(question: str, context: str) -> str:
    prompt = f"""QUESTION: {question}
CONTEXT:
{context}
FOLLOWUP:
""".strip()
    encoding = tokenizer(prompt, return_tensors='pt').to(DEVICE)
    with torch.inference_mode():
        outputs = model.generate(
            input_ids=encoding.input_ids
            , attention_mask=encoding.attention_mask
            , generation_config=generation_config
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True).split("FOLLOWUP: ")[1]

# starting the engineer off with a real bit of context from an SEC filing, with a naive question posed.
# the same question was used to retrieve the context from a vector database initially
answer = generate_response(
    """What are the potential risks for Bank of America?"""
    , """We believe that these factors include, but are not limited to, the following: Insurance Risk &#8226; the cyclical nature of the insurance and reinsurance business leading to periods with excess underwriting capacity and unfavorable premium rates; &#8226; the occurrence and magnitude of natural and man-made disasters, including the potential increase of our exposure to natural catastrophe losses due to climate change and the potential for inherently unpredictable losses from man-made catastrophes, such as cyber-attacks.; &#8226; the effects of emerging claims, systemic risks, and coverage and regulatory issues, including increasing litigation and uncertainty related to coverage definitions, limits, terms and conditions; &#8226; actual claims exceeding reserves for losses and loss expenses; &#8226; the adverse impact of inflation; &#8226; the failure of any of the loss limitation methods we employ; &#8226; the failure of our cedants to adequately evaluate risks; Strategic Risk &#8226; losses from war including losses related to the Russian invasion of Ukraine, terrorism and political unrest, or other unanticipated losses; &#8226; changes in the political environment of certain countries in which we operate or underwrite business, including the United Kingdom's withdrawal from the European Union; &#8226; the loss of business provided to us by major brokers; &#8226; a decline in our ratings with rating agencies; &#8226; the loss of one or more of our key executives; &#8226; difficulties with technology and/or data security; &#8226; increasing scrutiny and evolving expectations from investors, customers, regulators, policymakers and other stakeholders regarding environmental, social and governance matters; COVID-19 &#8226; the adverse impact of the ongoing COVID-19 pandemic on our business, results of operations, financial condition, and liquidity; Credit and Market Risk &#8226; the inability to purchase reinsurance or collect amounts due to us from reinsurance we have purchased; &#8226; the failure of our policyholders or intermediaries to pay premiums; &#8226; general economic, capital and credit market conditions, including banking sector instability, financial market illiquidity and fluctuations in interest rates, credit spreads, equity securities' prices, and/or foreign currency exchange rates; &#8226; breaches by third parties in our program business of their obligations to us; Liquidity Risk &#8226; the inability to access sufficient cash to meet our obligations when they are due; Operational Risk &#8226; changes in accounting policies or practices; &#8226; the use of industry models and changes to these models; &#8226; difficulties with technology and/or data security; Regulatory Risk &#8226; changes in governmental regulations and potential government intervention in our industry; &#8226; inadvertent failure to comply with certain laws and regulations relating to sanctions and foreign corrupt practices; data protection and privacy; and Risks Related to Taxation &#8226; changes in tax laws; <|endoftext|>"""
)

## your to-do:
## process & chunk the responses from your source of context (usually a vector db) & loop into generating longer pieces until the '[ANSWER]:' is created by this adapter model
## without your intervention, [FOLLOWUP]: and [CONTEXT]: will be hallucinated from mostly undesirable base-model knowledge

## this will not do you much good on its own, because it will use base-model knowledge to continue its own research
# print("FOLLOWUP: "+answer)
## but this will get you started with a context flow where you can inject information and generate further until an answer is found
print("[FOLLOWUP]: "+answer.split('CONTEXT:')[0])
# >> [FOLLOWUP]: What steps has Bank of America taken to mitigate these risks?
```
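
As the to-do comments suggest, this adapter is meant to sit inside a retrieval loop: take the follow-up it generates, fetch real context for that follow-up, and generate again until an answer appears. Below is a minimal sketch of that loop, reusing `generate_response` from above, where `retrieve_context` is a hypothetical stand-in for your vector-database lookup.

``` python
def retrieve_context(query: str) -> str:
    """Hypothetical stand-in for your vector database / search retrieval."""
    raise NotImplementedError

def research(question: str, max_hops: int = 5) -> str:
    context = retrieve_context(question)
    generated = ""
    for _ in range(max_hops):
        generated = generate_response(question, context)
        # stop once the adapter produces an answer instead of another follow-up
        if "[ANSWER]:" in generated:
            return generated.split("[ANSWER]:")[-1].strip()
        # keep only the follow-up question; anything after 'CONTEXT:' is hallucinated filler
        followup = generated.split("CONTEXT:")[0].strip()
        # inject real retrieved context for the follow-up and go around again
        context += "\n" + retrieve_context(followup)
    return generated

# answer = research("What are the potential risks for Bank of America?")
```

The `[ANSWER]:` and `CONTEXT:` marker strings and the hop limit are assumptions taken from the comments above; adjust them to whatever the adapter actually emits in your runs.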

## 🤖 Generated Modelcard

---
library_name: peft
---

## Training procedure

The following `bitsandbytes` quantization config was used during training:
- load_in_8bit: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16

### Framework versions

- PEFT 0.4.0.dev0