Phi-2 QLoRA Fine-tuned: Earnings Call Summarizer

A QLoRA fine-tuned adapter on top of Microsoft's Phi-2 (2.7B) for summarizing financial earnings call excerpts into structured highlights.

Model Details

  • Developed by: Sanjay R K
  • Base model: microsoft/phi-2
  • Model type: Causal LM with LoRA adapter
  • Language: English
  • License: Apache 2.0
  • Fine-tuning method: QLoRA (NF4 4-bit quantization + LoRA)

Training Details

  • Dataset: gbharti/finance-alpaca (1,000 samples)
  • Epochs: 3
  • Learning rate: 2e-4
  • LoRA rank (r): 8
  • LoRA alpha: 16
  • Target modules: q_proj, k_proj, v_proj, dense
  • Trainable parameters: 5,242,880 / 2,784,926,720 (0.19%)
  • Hardware: NVIDIA Tesla T4 (Google Colab)
  • Training time: ~29 minutes
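
The trainable-parameter figure above can be reproduced by hand. The sketch below assumes Phi-2 has 32 transformer layers with a hidden size of 2560, and that each of the four target modules is a square 2560 x 2560 linear layer; these shapes are not stated in this card. The helper name `lora_param_count` is hypothetical.

```python
# Assumed Phi-2 shapes (not stated in this card): 32 layers, hidden size 2560.
NUM_LAYERS = 32
HIDDEN_SIZE = 2560

# Values taken from the training details above.
LORA_RANK = 8
LORA_ALPHA = 16
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "dense"]
TOTAL_PARAMS = 2_784_926_720  # base model plus adapter, as reported


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A (rank x in) plus B (out x rank)."""
    return rank * in_features + out_features * rank


per_module = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
trainable = per_module * len(TARGET_MODULES) * NUM_LAYERS
percent = 100 * trainable / TOTAL_PARAMS
scaling = LORA_ALPHA / LORA_RANK  # LoRA scales the low-rank update by alpha / r

print(f"trainable parameters: {trainable:,}")
print(f"share of all parameters: {percent:.2f}%")
print(f"LoRA scaling factor: {scaling}")
```

Under these assumptions the count matches the reported 5,242,880 trainable parameters, which suggests that "dense" here matches only the attention output projection in each layer.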

Training Loss

Epoch   Training Loss   Validation Loss
1       1.9123          1.7098
2       1.7530          1.6598
3       1.6555          1.6539

Evaluation Results

Scores come from a single held-out earnings call excerpt, comparing the generated summary against one reference summary.

Metric    Base Phi-2   Fine-tuned
ROUGE-1   n/a          0.7368
ROUGE-2   n/a          0.5161
ROUGE-L   n/a          0.6316
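
For readers unfamiliar with ROUGE, the following is a minimal sketch of how a ROUGE-N F1 score can be computed from n-gram overlap. It is a simplified illustration with lowercase whitespace tokenization and no stemming, not the scoring code used for the table above; the function names and example strings are made up. ROUGE-L, which relies on the longest common subsequence, is not covered.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 using lowercase whitespace tokenization."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Made-up strings for illustration only; not the card's evaluation data.
candidate = "revenue grew 10 percent"
reference = "revenue grew 12 percent"
print(rouge_n_f1(candidate, reference, n=1))
print(rouge_n_f1(candidate, reference, n=2))
```

Established ROUGE implementations usually add stemming and other tokenization rules, so their scores may differ from this sketch.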

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16
)

tokenizer = AutoTokenizer.from_pretrained("sanju2007/phi2-earnings-summarizer-qlora")

base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

model = PeftModel.from_pretrained(base_model, "sanju2007/phi2-earnings-summarizer-qlora")
model.eval()

prompt = """### Instruction:
Summarize the key financial highlights from the following earnings call excerpt.

### Input:
Your earnings call text here.

### Response:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=200,
                             temperature=0.7, do_sample=True,
                             pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
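
The decoded output above contains the prompt followed by the generated summary. As a convenience, the sketch below wraps the prompt template in a function and strips everything up to the response marker. The helper names `build_prompt` and `extract_summary` are hypothetical and not part of this repository.

```python
RESPONSE_MARKER = "### Response:"


def build_prompt(excerpt: str) -> str:
    """Fill the instruction template shown above with an excerpt."""
    return (
        "### Instruction:\n"
        "Summarize the key financial highlights from the following "
        "earnings call excerpt.\n\n"
        "### Input:\n"
        f"{excerpt.strip()}\n\n"
        f"{RESPONSE_MARKER}"
    )


def extract_summary(decoded: str) -> str:
    """Return the text after the first response marker, or all text if absent."""
    _, marker, tail = decoded.partition(RESPONSE_MARKER)
    return tail.strip() if marker else decoded.strip()


# Illustration with a made-up decoded string instead of real model output.
prompt = build_prompt("Your earnings call text here.")
decoded = prompt + " Revenue rose year over year."
print(extract_summary(decoded))
```

In the snippet above, `extract_summary` would be applied to the string returned by `tokenizer.decode(...)`.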

Limitations

  • Trained on only 1,000 samples; not production-ready
  • Performance may degrade on non-English earnings calls
  • Based on Phi-2, which has a 2,048-token context limit
  • Not suitable for financial advice or decision-making
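
Because of the context limit, long excerpts need to be shortened before generation. The sketch below shows one way to budget tokens, assuming the 200 new tokens used in the usage example and simple head truncation. The function names and the 40-token template size are hypothetical; in practice the token ids would come from the tokenizer.

```python
CONTEXT_LIMIT = 2048   # Phi-2 context window, from the limitation above
MAX_NEW_TOKENS = 200   # matches max_new_tokens in the usage example


def excerpt_token_budget(template_tokens: int,
                         context_limit: int = CONTEXT_LIMIT,
                         max_new_tokens: int = MAX_NEW_TOKENS) -> int:
    """Tokens left for the excerpt after the template and the summary."""
    return max(context_limit - max_new_tokens - template_tokens, 0)


def truncate_excerpt(excerpt_ids: list[int], budget: int) -> list[int]:
    """Keep the first `budget` token ids; later text is dropped."""
    return excerpt_ids[:budget]


# Hypothetical numbers: a 40-token template and a 3,000-token excerpt.
budget = excerpt_token_budget(template_tokens=40)
kept = truncate_excerpt(list(range(3000)), budget)
print(budget, len(kept))
```

Head truncation discards the end of the call; splitting a transcript into chunks and summarizing each one is an alternative when later sections matter.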

Environmental Impact

  • Hardware: NVIDIA Tesla T4
  • Cloud provider: Google Colab
  • Training time: ~29 minutes