Domain-Specific IE Adapter: Gemma 3 27B (short instruction)

LoRA adapter for google/gemma-3-27b-it fine-tuned to extract compensation-consultant mentions from SEC proxy statements (DEF 14A), classifying each firm as:

  • RET: consultant retained/engaged as a compensation advisor
  • SURV: survey-only data provider (not retained as an advisor)

Companion artifact for the anonymous submission "From Lengthy Narrative to Structured Data: Instruction Fine-Tuning Open-Weight LLMs for Information Extraction from Corporate Disclosures."

This adapter

Base model: google/gemma-3-27b-it
Method: LoRA (r=8, α=16), 4-bit QLoRA
Instruction format: minimal (short)
Instance-level F1: 96.1%

Each adapter is trained for one instruction variant; pair this adapter with the short prompt at inference.

Adapter family (same task, 2,001-sample training set)

Adapter                            Base          Instruction      F1
domain-specific-adapter            Gemma 3 27B   detailed (long)  95.9%
domain-specific-adapter-short      Gemma 3 27B   minimal (short)  96.1%
domain-specific-12b-adapter        Gemma 3 12B   detailed (long)  95.7%
domain-specific-12b-adapter-short  Gemma 3 12B   minimal (short)  93.0%

Evaluated on 316 consultants across 143 company-years from 84 SEC filings.
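
The card does not spell out how instance-level F1 is computed. One plausible reading, sketched here as an assumption rather than the authors' evaluation script, treats each (company-year, firm, label) triple as one instance and scores exact matches between the gold and predicted sets. The function name `instance_f1` and the toy records are hypothetical and are not drawn from the evaluation data.

```python
def instance_f1(gold, pred):
    """Micro-averaged precision, recall and F1 over sets of labeled instances."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)  # instances that match exactly, label included
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Toy data for illustration only (hypothetical company-years and labels).
gold = {
    ("ACME-2023", "Pearl Meyer", "RET"),
    ("ACME-2023", "Mercer", "SURV"),
    ("BETA-2022", "Radford", "SURV"),
}
pred = {
    ("ACME-2023", "Pearl Meyer", "RET"),
    ("ACME-2023", "Mercer", "RET"),  # right firm, wrong label: counts as an error
}
precision, recall, f1 = instance_f1(gold, pred)
```

Under this reading, a firm extracted with the wrong label counts as both a false positive and a false negative, which is stricter than scoring firm names alone.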

Usage

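from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base = "google/gemma-3-27b-it"
tok = AutoTokenizer.from_pretrained(base)
# Load the base model in 4-bit via bitsandbytes.
quant = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto", quantization_config=quant)
# Attach the LoRA adapter weights on top of the quantized base model.
model = PeftModel.from_pretrained(model, "cs-file-uploads/domain-specific-adapter-short")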

See the code repository for the full inference pipeline (retrieval → chunking → extraction → grounding validation → cross-chunk aggregation) and the exact prompt templates.
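
To make the last two pipeline stages concrete, here is a minimal sketch of what grounding validation and cross-chunk aggregation could look like. It is not the repository's implementation: the helper names (`normalize`, `is_grounded`, `aggregate`), the substring-based grounding test, and the rule that RET overrides SURV when a firm receives both labels are all assumptions made for illustration.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so matching ignores layout differences."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_grounded(name, chunk):
    """Hypothetical grounding check: the extracted name must appear in the chunk."""
    return normalize(name) in normalize(chunk)


def aggregate(chunk_results):
    """Merge per-chunk extractions into one label per firm.

    chunk_results: list of (chunk_text, {"RET": [...], "SURV": [...]}) pairs.
    Assumption: RET overrides SURV when a firm receives both labels.
    """
    merged = {}
    for chunk, extraction in chunk_results:
        for label in ("RET", "SURV"):
            for name in extraction.get(label, []):
                if not is_grounded(name, chunk):
                    continue  # drop names the chunk text does not contain
                key = normalize(name)
                if key not in merged or label == "RET":
                    merged[key] = (name, label)
    return sorted(merged.values())


# Toy chunks for illustration only.
chunks = [
    ("The Committee retained Pearl Meyer as its independent advisor.",
     {"RET": ["Pearl Meyer"], "SURV": []}),
    ("Survey data from Mercer and Pearl Meyer were reviewed.",
     {"RET": [], "SURV": ["Mercer", "Pearl Meyer", "Radford"]}),
]
firms = aggregate(chunks)
```

In this toy run, "Radford" is discarded because it never appears in its chunk, and "Pearl Meyer" keeps the RET label from the first chunk.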

Output format

{RET: 'Pearl Meyer & Partners, LLC'}, {SURV: 'Mercer', 'Radford'}
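
The output line above is not valid JSON, so downstream code needs a small parser. The sketch below assumes the format is exactly as shown: brace-delimited groups, a RET or SURV label, and single-quoted firm names. The function name `parse_output` is hypothetical, and names that themselves contain an apostrophe would need a more careful grammar.

```python
import re

# Matches one group such as {RET: 'A', 'B'}; the label is captured separately.
GROUP_RE = re.compile(r"\{\s*(RET|SURV)\s*:\s*([^{}]*)\}")
# Matches one single-quoted firm name; commas inside the quotes are preserved.
NAME_RE = re.compile(r"'([^']*)'")


def parse_output(text):
    """Parse adapter output into {'RET': [...], 'SURV': [...]}."""
    result = {"RET": [], "SURV": []}
    for label, body in GROUP_RE.findall(text):
        for name in NAME_RE.findall(body):
            name = name.strip()
            if name and name not in result[label]:
                result[label].append(name)  # keep first-seen order, skip repeats
    return result


example = "{RET: 'Pearl Meyer & Partners, LLC'}, {SURV: 'Mercer', 'Radford'}"
parsed = parse_output(example)
```

Splitting on quoted spans rather than on commas keeps firm names such as 'Pearl Meyer & Partners, LLC' in one piece.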

Training

2,001 human-labeled and augmented proxy-statement excerpts; LR 2e-4 (cosine, 3% warmup); max sequence length 5,120; 3 epochs; 20% validation split.
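
For readers unfamiliar with the schedule, the sketch below shows how a peak learning rate of 2e-4 with 3% warmup and cosine decay evolves over training. The linear warmup shape, the decay to zero, and the function name `lr_at` are assumptions; the card does not state the batch size, so the total step count is left as a parameter.

```python
import math


def lr_at(step, total_steps, peak_lr=2e-4, warmup_frac=0.03):
    """Learning rate at a 0-indexed optimizer step."""
    warmup_steps = max(1, round(total_steps * warmup_frac))
    if step < warmup_steps:
        # Linear warmup (assumed) from 0 toward the peak.
        return peak_lr * step / warmup_steps
    # Cosine decay from the peak toward 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run length, chosen only to illustrate the curve.
schedule = [lr_at(s, total_steps=1000) for s in range(1001)]
```

With 1,000 total steps, the rate peaks at step 30 and then falls smoothly toward zero by the final step.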

License

Derived from Google Gemma 3; use is subject to the Gemma Terms of Use. Adapter weights are released for research use.

Citation

@misc{anonymous2026fromlengthy,
  title={From Lengthy Narrative to Structured Data: Instruction Fine-Tuning Open-Weight LLMs for Information Extraction from Corporate Disclosures},
  author={Anonymous},
  year={2026},
  note={Under review}
}