---
library_name: peft
license: apache-2.0
datasets:
- dylanalloy/ehc-contrived-financial
language:
- en
---
# Everything Has Context

### falcon-ehc-contrived-financial-7b


## 🤷 Purpose

A fine-tuned adapter for [falcon-7b-instruct](https://huggingface.co/tiiuae/falcon-7b-instruct), built for question answering over retrieved context and trained on the [dylanalloy/ehc-contrived-financial](https://huggingface.co/datasets/dylanalloy/ehc-contrived-financial) dataset. The dataset page explains how this repo could be used in future "chain-of-thought" research, and what this model cannot reasonably achieve given the contrived nature of the dataset's context.

## 🧶 Explain

The `falcon-7b-instruct` base model performs well for the compute it needs, QLoRA keeps fine-tuning memory low, `bitsandbytes` makes quantization easy, and `PEFT` makes the QLoRA training method easy.

The whole thing can be trained in about 2 hours on a single NVIDIA card with 12GB of VRAM released since 2020.
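
As a rough sanity check on the 12GB figure, the weight footprint of the quantized base model can be estimated with simple arithmetic. This is only a sketch: the parameter count of about 7 billion is an assumption taken from the model name, and the estimate ignores quantization constants, adapter weights, optimizer state and activations, so it is a lower bound rather than a measurement.

``` python
# Back-of-the-envelope estimate of 4-bit weight storage.
approx_params = 7e9   # assumption: "7b" in the model name, not an exact count
bits_per_param = 4    # 4-bit quantization, as used by this adapter's setup

weight_bytes = approx_params * bits_per_param / 8
weight_gb = weight_bytes / 1e9  # decimal gigabytes

print(f"approximate 4-bit weight storage: {weight_gb:.1f} GB")
```

The remaining memory on the card is what training has to fit gradients for the adapter, optimizer state and activations into.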

## 🏋️ Training

Fine-tuning with QLoRA is well documented elsewhere; the training here used the following parameters:

``` python
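from transformers import TrainingArguments

# OUTPUT_DIR is assumed to be defined elsewhere (the checkpoint directory)
training_args = TrainingArguments(
    per_device_train_batch_size=1
    , gradient_accumulation_steps=4
    , num_train_epochs=20  # superseded by max_steps below
    , learning_rate=2e-4
    , fp16=True
    , save_total_limit=3
    , logging_steps=13
    , output_dir=OUTPUT_DIR
    , max_steps=500  # a positive max_steps overrides num_train_epochs
    , optim="paged_adamw_8bit"
    , lr_scheduler_type="cosine"
    , warmup_ratio=0.05
)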
```
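
The quantities these arguments imply can be worked out by hand. The sketch below assumes a single GPU (`num_gpus` is an illustrative name, not a `TrainingArguments` field) and assumes the warmup length is the step count times `warmup_ratio`, rounded up.

``` python
import math

# Values copied from the training arguments above.
per_device_train_batch_size = 1
gradient_accumulation_steps = 4
max_steps = 500
warmup_ratio = 0.05

num_gpus = 1  # assumption: single-card training

# Examples contributing to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)

# Total examples processed before training stops at max_steps.
examples_seen = effective_batch_size * max_steps

# Steps spent ramping the learning rate up before cosine decay begins.
warmup_steps = math.ceil(max_steps * warmup_ratio)

print(effective_batch_size, examples_seen, warmup_steps)
```

Because `max_steps` is set, training stops after 500 optimizer updates regardless of the `num_train_epochs` value.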

## ⌨️ Usage

This implementation depends on both `PEFT` and `bitsandbytes`. If you have not used them before, their documentation is excellent.

``` python
from peft import (
    LoraConfig
    , PeftConfig
    , PeftModel
)
from transformers import (
    AutoModelForCausalLM
    , AutoTokenizer
    , BitsAndBytesConfig
)
import torch

PEFT_MODEL = "dylanalloy/falcon-ehc-contrived-financial-7b"

config = PeftConfig.from_pretrained(PEFT_MODEL)

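bb_config = BitsAndBytesConfig(
    load_in_4bit=True
    , bnb_4bit_use_double_quant=True
    # the keyword is `bnb_4bit_quant_type` (it was misspelled `bb_4bit_quant_type`);
    # the generated model card below reports "fp4" for training, so use "fp4" to match it
    , bnb_4bit_quant_type="nf4"
    , bnb_4bit_compute_dtype=torch.bfloat16
)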

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path
    , return_dict=True
    , quantization_config=bb_config
    , device_map="auto"
    , trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token

model = PeftModel.from_pretrained(model, PEFT_MODEL)

generation_config = model.generation_config
generation_config.max_new_tokens = 200
generation_config.temperature = 0.7
generation_config.top_p = 0.7
generation_config.num_return_sequences = 1
generation_config.pad_token_id = tokenizer.eos_token_id
generation_config.eos_token_id = tokenizer.eos_token_id

DEVICE = "cuda:0"

def generate_response(question: str, context: str) -> str:
    prompt = f"""QUESTION: {question}
                CONTEXT:
                {context}
                FOLLOWUP:
                """.strip()
    encoding = tokenizer(prompt, return_tensors='pt').to(DEVICE)
    with torch.inference_mode():
        outputs = model.generate(
            input_ids=encoding.input_ids
            , attention_mask=encoding.attention_mask
            , generation_config=generation_config
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True).split("FOLLOWUP: ")[1]

# starting the engineer off with a real bit of context from an SEC filing with a naive question posed. 
# the same question was used to retrieve the context from a vector database initially
answer = generate_response(
    """What are the potential risks for Bank of America?"""
    , """We believe that these factors include, but are not limited to, the following: Insurance Risk  &#8226; the cyclical nature of the insurance and reinsurance business leading to periods with excess underwriting  capacity and unfavorable premium rates; &#8226; the occurrence and magnitude of natural and man-made disasters, including the potential increase of our  exposure to natural catastrophe losses due to climate change and the potential for inherently  unpredictable losses from man-made catastrophes, such as cyber-attacks.; &#8226; the effects of emerging claims, systemic risks, and coverage and regulatory issues, including increasing  litigation and uncertainty related to coverage definitions, limits, terms and conditions; &#8226; actual claims exceeding reserves for losses and loss expenses; &#8226; the adverse impact of inflation; &#8226; the failure of any of the loss limitation methods we employ; &#8226; the failure of our cedants to adequately evaluate risks; Strategic Risk &#8226; losses from war including losses related to the Russian invasion of Ukraine, terrorism and political unrest, or  other unanticipated losses; &#8226; changes in the political environment of certain countries in which we operate or underwrite business,  including the United Kingdom's withdrawal from the European Union; &#8226; the loss of business provided to us by major brokers; &#8226; a decline in our ratings with rating agencies; &#8226; the loss of one or more of our key executives; &#8226; difficulties with technology and/or data security; &#8226; increasing scrutiny and evolving expectations from investors, customers, regulators, policymakers and other  stakeholders regarding environmental, social and governance matters; COVID-19 &#8226; the adverse impact of the ongoing COVID-19 pandemic on our business, results of operations, financial  condition, and liquidity; Credit and Market Risk &#8226; the inability to purchase reinsurance or collect amounts due to us from 
reinsurance we have purchased; &#8226; the failure of our policyholders or intermediaries to pay premiums; &#8226; general economic, capital and credit market conditions, including banking sector instability, financial market  illiquidity and fluctuations in interest rates, credit spreads, equity securities' prices, and/or foreign currency  exchange rates; &#8226; breaches by third parties in our program business of their obligations to us; Liquidity Risk &#8226; the inability to access sufficient cash to meet our obligations when they are due; Operational Risk &#8226; changes in accounting policies or practices; &#8226; the use of industry models and changes to these models; &#8226; difficulties with technology and/or data security; Regulatory Risk &#8226; changes in governmental regulations and potential government intervention in our industry; &#8226; inadvertent failure to comply with certain laws and regulations relating to sanctions and foreign corrupt  practices; data protection and privacy; and Risks Related to Taxation &#8226; changes in tax laws; <|endoftext|>"""
)

## your to-do:
## process & chunk the responses from your source of context (usually a vector db) & loop into generating longer pieces until the '[ANSWER]:' is created by this adapter model
## without your intervention, [FOLLOWUP]: and [CONTEXT]: will be hallucinated and will be derived from mostly undesirable model knowledge

## this will not do you much good because it will use base model knowledge to continue its own research
# print("FOLLOWUP: "+answer)
## but this will get you started with a context flow where you can inject information and generate further until an answer is found
print("[FOLLOWUP]: "+answer.split('CONTEXT:')[0])
>> [FOLLOWUP]: What steps has Bank of America taken to mitigate these risks?
print(answer)
>> [QUESTION]: What steps has Bank of America taken to mitigate these risks?
[CONTEXT]: We believe that these factors include, but are not limited to, the following: Insurance Risk  &#8226; the cyclical nature of the insurance and reinsurance business leading to periods with excess underwriting  capacity and unfavorable premium rates; &#8226; the occurrence and magnitude of natural and man-made disasters, including the potential increase of our  exposure to natural catastrophe losses due to climate change and the potential for inherently  unpredictable losses from man-made catastrophes, such as cyber-attacks.; &#8226; the effects of emerging claims, systemic risks, and coverage and regulatory issues, including increasing  litigation and uncertainty related to coverage definitions, limits, terms and conditions; &#8226; actual claims exceeding reserves for losses and loss expenses; &#8226; the adverse impact of inflation; &#8226; the failure of any of the loss limitation methods we employ; &#8226; the failure of our cedants to adequately evaluate risks; Strategic Risk &#8226; losses from war including losses related to the Russian invasion of Ukraine, terrorism and political unrest, or  other unanticipated losses; &#8226; changes in the political environment of certain countries in which we operate or underwrite business,  including the United Kingdom's withdrawal from the European Union; &#8226; the loss of business provided to us by major brokers; &#8226; a decline in our ratings with rating agencies; &#8226; the loss of one or more of our key executives; &#8226; difficulties with technology and/or data security; &#8226; increasing scrutiny and evolving expectations from investors, customers, regulators, policymakers and other  stakeholders regarding environmental, social and governance matters; COVID-19 &#8226; the adverse impact of the ongoing COVID-19 pandemic on our business, results of operations, financial  condition, and liquidity; Credit and Market Risk &#8226; the inability to purchase reinsurance or collect amounts due to us from 
reinsurance we have purchased; &#8226; the failure of our policyholders or intermediaries to pay premiums; &#8226; general economic, capital and credit market conditions, including banking sector instability, financial market  illiquidity and fluctuations in interest rates, credit spreads, equity securities' prices, and/or foreign currency  exchange rates; &#8226; breaches by third parties in our program business of their obligations to us; Liquidity Risk &#8226; the inability to access sufficient cash to meet our obligations when they are due; Operational Risk &#8226; changes in accounting policies or practices; &#8226; the use of industry models and changes to these models; &#8226; difficulties with technology and/or data security; Regulatory Risk &#8226; changes in governmental regulations and potential government intervention in our industry; &#8226; inadvertent failure to comply with certain laws and regulations relating to sanctions and foreign corrupt  practices; data protection and privacy; and Risks Related to Taxation &#8226; changes in tax laws; 
[FOLLOWUP]: What steps has Bank of America taken to address these factors?
[CONTEXT]: Bank of America has implemented various measures to address these factors. For example: &#8226; We have implemented a comprehensive risk management framework that includes risk identification risk assessment risk mitigation and risk monitoring. &#8226; We have implemented advanced data analytics and predictive modeling techniques to better understand and anticipate potential risks. &#8226; We have enhanced our risk management processes to ensure timely identification and mitigation of risks. &#8226; We have implemented a robust risk management structure that includes regular risk assessments and monitoring of key risk indicators. &#8226; We have established a dedicated risk management team to oversee the implementation of risk mitigation strategies. &#8226; We have implemented a comprehensive cyber security program to protect against potential cyber threats. &#8226; We have implemented a comprehensive environmental risk management program to address environmental risks. &#8226; We have implemented a comprehensive risk management program to address operational risks. &#8226; We have implemented a comprehensive risk management program to address liquidity risks. &#8226; We have implemented a comprehensive risk management program to address regulatory risks. &#8226; We have implemented a comprehensive risk management program to address tax-related risks. [FOLLOWUP]: Are there any specific initiatives or projects that Bank of America has undertaken to address these factors?
[CONTEXT]: Yes Bank of America has undertaken several initiatives and projects to address these factors. For example: &#8226; We have implemented a comprehensive risk management program that includes risk assessments and mitigation strategies. &#8226; We have implemented a comprehensive cyber security program to protect against potential cyber threats. &#8226; We have implemented a comprehensive environmental risk management program to address environmental risks. &#8226; We have implemented a comprehensive risk management program to address operational risks. &#8226; We have implemented a comprehensive risk management program to address liquidity risks. &#8226; We have implemented a comprehensive risk management program to address regulatory risks. [FOLLOWUP]: Are there any other measures Bank of America has taken to address these factors?
[CONTEXT]: Yes Bank of America has taken additional measures to address these factors. For example: &#8226; We have implemented a comprehensive risk management program th
```

## 🤖 Generated Modelcard

## Training procedure

The following `bitsandbytes` quantization config was used during training:
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
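
When loading the adapter for inference, it can help to check the quantization settings against the values listed above. The sketch below is illustrative only: `TRAINING_QUANT` simply transcribes the list, while `quant_mismatches` and the `inference_kwargs` dictionary are hypothetical helpers, not part of `bitsandbytes` or `transformers`.

``` python
# Values transcribed from the training-time quantization list above.
TRAINING_QUANT = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",
}


def quant_mismatches(inference_kwargs: dict) -> dict:
    """Return {key: (training_value, inference_value)} for differing keys."""
    return {
        key: (TRAINING_QUANT[key], value)
        for key, value in inference_kwargs.items()
        if key in TRAINING_QUANT and TRAINING_QUANT[key] != value
    }


# Hypothetical inference-time settings to compare against training.
inference_kwargs = {
    "load_in_4bit": True,
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_compute_dtype": "bfloat16",
}

print(quant_mismatches(inference_kwargs))
```

With these example settings the only difference reported is the 4-bit quantization type.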

### Framework versions

- PEFT 0.4.0.dev0