---
library_name: peft
license: apache-2.0
datasets:
  - dylanalloy/ehc-contrived-financial
language:
  - en
---

# Everything Has Context

## falcon-ehc-contrived-financial-7b

### 🀷 Purpose

A finetuned adapter (base model: falcon-7b-instruct) engineered for Q/A with context retrieval, trained on the dylanalloy/ehc-contrived-financial dataset. Read more on the dataset page to understand a bit about how this repo could be used in future "chain-of-thought" research, and what this model cannot reasonably achieve given the contrived nature of the dataset's context.

### 🧢 Explain

The falcon-7b-instruct base model is high-performing for the compute it requires, QLoRA is a great finetuning method, bitsandbytes makes quantization easy, and PEFT makes the QLoRA training method easy to apply.

The whole thing can be trained in about 2 hours on a single NVIDIA card with 12GB of VRAM from any generation released since 2020.
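In practice that stack looks something like the sketch below: load the base model in 4-bit with bitsandbytes, prepare it for k-bit training, and attach a LoRA adapter with PEFT. The LoraConfig values here (rank, alpha, dropout) are illustrative assumptions, not the exact hyperparameters used to train this adapter:

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

BASE_MODEL = "tiiuae/falcon-7b-instruct"

# load the base model in 4-bit so it fits on a single 12GB card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True
    , bnb_4bit_use_double_quant=True
    , bnb_4bit_compute_dtype=torch.bfloat16
)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL
    , quantization_config=bnb_config
    , device_map="auto"
    , trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token

# wrap for k-bit training, then attach the low-rank adapter
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16                                  # assumption, not published for this adapter
    , lora_alpha=32                       # assumption
    , target_modules=["query_key_value"]  # falcon's fused attention projection
    , lora_dropout=0.05                   # assumption
    , bias="none"
    , task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```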

πŸ‹οΈ Training

Finetuning with QLoRA is heavily documented, but the training here was done using the following parameters:

```python
from transformers import TrainingArguments

# OUTPUT_DIR is wherever you want checkpoints and the adapter written
training_args = TrainingArguments(
    per_device_train_batch_size=1
    , gradient_accumulation_steps=4
    , num_train_epochs=20
    , learning_rate=2e-4
    , fp16=True
    , save_total_limit=3
    , logging_steps=13
    , output_dir=OUTPUT_DIR
    , max_steps=500
    , optim="paged_adamw_8bit"
    , lr_scheduler_type="cosine"
    , warmup_ratio=0.05
)
```
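These arguments then feed a standard transformers Trainer. A minimal sketch of that wiring, assuming the quantized, LoRA-wrapped `model` and `tokenizer` from the sketch above and an already-tokenized split of dylanalloy/ehc-contrived-financial named `data`:

```python
import transformers

# model/tokenizer: the 4-bit, LoRA-wrapped falcon-7b-instruct prepared earlier (assumed)
# data: a tokenized split of dylanalloy/ehc-contrived-financial (assumed prepared already)
trainer = transformers.Trainer(
    model=model
    , args=training_args
    , train_dataset=data
    , data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False)
)
model.config.use_cache = False  # caching is not useful while training
trainer.train()
```

After training, `model.save_pretrained(OUTPUT_DIR)` writes only the adapter weights, which is what this repo hosts.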

### ⌨️ Usage

PEFT and bitsandbytes are both important in this implementation. If you do not know how to use them, their documentation is excellent.

```python
from peft import (
    LoraConfig
    , PeftConfig
    , PeftModel
)
from transformers import (
    AutoModelForCausalLM
    , AutoTokenizer
    , BitsAndBytesConfig
)
import torch

PEFT_MODEL = "dylanalloy/falcon-ehc-contrived-financial-7b"

config = PeftConfig.from_pretrained(PEFT_MODEL)

bb_config = BitsAndBytesConfig(
    load_in_4bit=True
    , bnb_4bit_use_double_quant=True
    , bb_4bit_quant_type="nf4"
    , bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path
    , return_dict=True
    , quantization_config=bb_config
    , device_map="auto"
    , trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token

model = PeftModel.from_pretrained(model, PEFT_MODEL)

generation_config = model.generation_config
generation_config.max_new_tokens = 200
generation_config.temperature = 0.7
generation_config.top_p = 0.7
generation_config.num_return_sequences = 1
generation_config.pad_token_id = tokenizer.eos_token_id
generation_config.eos_token_id = tokenizer.eos_token_id

DEVICE = "cuda:0"

def generate_response(question: str, context: str) -> str:
    prompt = f"""QUESTION: {question}
                CONTEXT:
                {context}
                FOLLOWUP:
                """.strip()
    encoding = tokenizer(prompt, return_tensors='pt').to(DEVICE)
    with torch.inference_mode():
        outputs = model.generate(
            input_ids=encoding.input_ids
            , attention_mask=encoding.attention_mask
            , generation_config=generation_config
        )
    ## return only the model's continuation after the FOLLOWUP: marker
    return tokenizer.decode(outputs[0], skip_special_tokens=True).split("FOLLOWUP: ")[1]

# starting the engineer off with a real bit of context from an SEC filing with a naive question posed. 
# the same question was used to retrieve the context from a vector database initially
answer = generate_response(
    """What are the potential risks for Bank of America?"""
    , """We believe that these factors include, but are not limited to, the following: Insurance Risk  &#8226; the cyclical nature of the insurance and reinsurance business leading to periods with excess underwriting  capacity and unfavorable premium rates; &#8226; the occurrence and magnitude of natural and man-made disasters, including the potential increase of our  exposure to natural catastrophe losses due to climate change and the potential for inherently  unpredictable losses from man-made catastrophes, such as cyber-attacks.; &#8226; the effects of emerging claims, systemic risks, and coverage and regulatory issues, including increasing  litigation and uncertainty related to coverage definitions, limits, terms and conditions; &#8226; actual claims exceeding reserves for losses and loss expenses; &#8226; the adverse impact of inflation; &#8226; the failure of any of the loss limitation methods we employ; &#8226; the failure of our cedants to adequately evaluate risks; Strategic Risk &#8226; losses from war including losses related to the Russian invasion of Ukraine, terrorism and political unrest, or  other unanticipated losses; &#8226; changes in the political environment of certain countries in which we operate or underwrite business,  including the United Kingdom's withdrawal from the European Union; &#8226; the loss of business provided to us by major brokers; &#8226; a decline in our ratings with rating agencies; &#8226; the loss of one or more of our key executives; &#8226; difficulties with technology and/or data security; &#8226; increasing scrutiny and evolving expectations from investors, customers, regulators, policymakers and other  stakeholders regarding environmental, social and governance matters; COVID-19 &#8226; the adverse impact of the ongoing COVID-19 pandemic on our business, results of operations, financial  condition, and liquidity; Credit and Market Risk &#8226; the inability to purchase reinsurance or collect amounts due to us from reinsurance we have purchased; &#8226; the failure of our policyholders or intermediaries to pay premiums; &#8226; general economic, capital and credit market conditions, including banking sector instability, financial market  illiquidity and fluctuations in interest rates, credit spreads, equity securities' prices, and/or foreign currency  exchange rates; &#8226; breaches by third parties in our program business of their obligations to us; Liquidity Risk &#8226; the inability to access sufficient cash to meet our obligations when they are due; Operational Risk &#8226; changes in accounting policies or practices; &#8226; the use of industry models and changes to these models; &#8226; difficulties with technology and/or data security; Regulatory Risk &#8226; changes in governmental regulations and potential government intervention in our industry; &#8226; inadvertent failure to comply with certain laws and regulations relating to sanctions and foreign corrupt  practices; data protection and privacy; and Risks Related to Taxation &#8226; changes in tax laws; <|endoftext|>"""
)

## your to-do:
## process & chunk the responses from your source of context (usually a vector db), then loop the generation until this adapter model produces an '[ANSWER]:' (see the sketch after this block)
## without your intervention, [FOLLOWUP]: and [CONTEXT]: will be hallucinated from mostly undesirable base-model knowledge

## this will not do you much good because it will use base model knowledge to continue its own research
# print("FOLLOWUP: "+answer)
## but this will get you started with a context flow where you can inject information and generate further until an answer is found
print("[FOLLOWUP]: "+answer.split('CONTEXT:')[0])
>> [FOLLOWUP]: What steps has Bank of America taken to mitigate these risks?
```
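One way to wire up that flow; a hedged sketch that assumes a hypothetical `retrieve_context(query)` helper wrapping your vector database lookup, and assumes the adapter emits an `ANSWER:` marker once it has enough context (adjust the markers to match your data):

```python
def answer_with_retrieval(question: str, retrieve_context, max_hops: int = 5) -> str:
    ## retrieve_context(query: str) -> str is a hypothetical wrapper around your vector db
    followup = question
    for _ in range(max_hops):
        generated = generate_response(followup, retrieve_context(followup))
        ## stop once the adapter emits an answer marker (marker format is an assumption)
        if "ANSWER:" in generated:
            return generated.split("ANSWER:")[-1].strip()
        ## otherwise keep only the new followup question, dropping any hallucinated
        ## context after it, and use it as the next retrieval query
        followup = generated.split("CONTEXT:")[0].strip()
    return followup  ## no answer within max_hops; return the last followup question
```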

### πŸ€– Generated Modelcard


#### Training procedure

The following bitsandbytes quantization config was used during training:

- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16

#### Framework versions

- PEFT 0.4.0.dev0