Domain-Specific IE Adapter: Gemma 3 12B (long instruction)
LoRA adapter for google/gemma-3-12b-it fine-tuned to extract compensation-consultant mentions from SEC proxy statements (DEF 14A), classifying each firm as:
- RET: consultant retained or engaged as a compensation advisor
- SURV: survey-only data provider (not retained as an advisor)
Companion artifact for the anonymous submission "From Lengthy Narrative to Structured Data: Instruction Fine-Tuning Open-Weight LLMs for Information Extraction from Corporate Disclosures."
This adapter
| Property | Value |
|---|---|
| Base model | google/gemma-3-12b-it |
| Method | LoRA (r=8, α=16), 4-bit QLoRA |
| Instruction format | detailed (long) |
| Instance-level F1 | 95.7% |
Each adapter is trained for one instruction variant; pair this adapter with the long prompt at inference.
Adapter family (same task, 2,001-sample training set)
| Adapter | Base | Instruction | F1 |
|---|---|---|---|
| domain-specific-adapter | Gemma 3 27B | detailed (long) | 95.9% |
| domain-specific-adapter-short | Gemma 3 27B | minimal (short) | 96.1% |
| domain-specific-12b-adapter | Gemma 3 12B | detailed (long) | 95.7% |
| domain-specific-12b-adapter-short | Gemma 3 12B | minimal (short) | 93.0% |
Evaluated on 316 consultants across 143 company-years from 84 SEC filings.
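The card does not spell out how instance-level F1 is computed. As a minimal sketch, assuming each instance is a `(company_year, firm, label)` triple and a prediction counts only when all three fields match the gold annotation exactly, the metric could be computed as below. The function name, the tuple layout, and the example identifiers are hypothetical; the exact matching rule used in the paper may differ.

```python
from typing import Dict, Iterable, Tuple

# Assumed instance layout: (company_year_id, firm_name, label)
Instance = Tuple[str, str, str]


def instance_f1(gold: Iterable[Instance], pred: Iterable[Instance]) -> Dict[str, float]:
    """Micro-averaged precision, recall, and F1 over exact-match instances."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy data (hypothetical): one firm is predicted with the wrong label.
gold = [
    ("ACME-2020", "Mercer", "SURV"),
    ("ACME-2020", "Pearl Meyer", "RET"),
    ("BETA-2021", "Radford", "SURV"),
]
pred = [
    ("ACME-2020", "Mercer", "SURV"),
    ("ACME-2020", "Pearl Meyer", "SURV"),  # wrong label, so no match
    ("BETA-2021", "Radford", "SURV"),
]
scores = instance_f1(gold, pred)
```

Under this definition a firm extracted with the wrong label counts as both a false positive and a false negative, so labeling errors are penalized as heavily as missed firms.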
Usage
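from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
base = "google/gemma-3-12b-it"
tok = AutoTokenizer.from_pretrained(base)
# Load the base model in 4-bit (requires bitsandbytes); the bare load_in_4bit kwarg is deprecated.
model = AutoModelForCausalLM.from_pretrained(
    base, device_map="auto", quantization_config=BitsAndBytesConfig(load_in_4bit=True)
)
# Attach the LoRA adapter weights to the quantized base model.
model = PeftModel.from_pretrained(model, "cs-file-uploads/domain-specific-12b-adapter")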
See the code repository for the full inference pipeline (retrieval → chunking → extraction → grounding validation → cross-chunk aggregation) and the exact prompt templates.
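The last two pipeline stages can be illustrated without the model. The sketch below is not the repository's implementation: it assumes that grounding validation means "the extracted name appears verbatim in the chunk it came from" (ignoring case and whitespace), and that aggregation is a union across chunks in which RET takes precedence over SURV for the same firm. Both rules, and the names `is_grounded` and `aggregate`, are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks do not block a match."""
    return " ".join(text.lower().split())


def is_grounded(name: str, chunk_text: str) -> bool:
    """Assumed grounding rule: the name must occur in the source chunk."""
    return _normalize(name) in _normalize(chunk_text)


def aggregate(chunk_results: List[Tuple[str, Dict[str, List[str]]]]) -> Dict[str, Set[str]]:
    """Merge per-chunk extractions, dropping names not found in their chunk."""
    merged: Dict[str, Set[str]] = defaultdict(set)
    for chunk_text, extraction in chunk_results:
        for label, names in extraction.items():
            for name in names:
                if is_grounded(name, chunk_text):
                    merged[label].add(name)
    # Assumed tie-break: a firm tagged RET in any chunk is not also reported as SURV.
    merged["SURV"] -= merged["RET"]
    return {label: set(merged[label]) for label in ("RET", "SURV")}


# Toy chunks (hypothetical text); "Towers Watson" is a hallucinated extraction.
chunks = [
    (
        "The Committee retained Pearl Meyer & Partners, LLC as its independent advisor.",
        {"RET": ["Pearl Meyer & Partners, LLC"], "SURV": []},
    ),
    (
        "Management reviewed survey data published by Mercer and Radford.",
        {"SURV": ["Mercer", "Radford", "Towers Watson"]},
    ),
]
result = aggregate(chunks)
```

In this toy run the ungrounded name is dropped because it never appears in its chunk, which is the kind of hallucination a grounding check is meant to catch.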
Output format
{RET: 'Pearl Meyer & Partners, LLC'}, {SURV: 'Mercer', 'Radford'}
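For downstream use, the string above can be turned into a dictionary. The following is a minimal sketch that assumes the model always emits brace-delimited groups with a `RET` or `SURV` label followed by single-quoted firm names, exactly as in the example; the function name `parse_output` is hypothetical, and names containing an apostrophe or a closing brace would need a more careful parser.

```python
import re
from typing import Dict, List

# One group per label, e.g. {SURV: 'Mercer', 'Radford'}
GROUP_RE = re.compile(r"\{\s*(RET|SURV)\s*:\s*(.*?)\}", re.DOTALL)
# Firm names are single-quoted; commas inside the quotes are preserved.
NAME_RE = re.compile(r"'([^']*)'")


def parse_output(text: str) -> Dict[str, List[str]]:
    """Parse the adapter's output string into {"RET": [...], "SURV": [...]}."""
    result: Dict[str, List[str]] = {"RET": [], "SURV": []}
    for label, body in GROUP_RE.findall(text):
        result[label].extend(NAME_RE.findall(body))
    return result


parsed = parse_output("{RET: 'Pearl Meyer & Partners, LLC'}, {SURV: 'Mercer', 'Radford'}")
# parsed == {"RET": ["Pearl Meyer & Partners, LLC"], "SURV": ["Mercer", "Radford"]}
```

Matching on the quotes rather than splitting on commas keeps names such as `Pearl Meyer & Partners, LLC` intact.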
Training
2,001 human-labeled and augmented proxy-statement excerpts; LR 2e-4 (cosine, 3% warmup); max sequence length 5,120; 3 epochs; 20% validation split.
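To make the schedule concrete, the sketch below computes the learning rate at a given optimizer step for linear warmup followed by cosine decay to zero, the same shape as `get_cosine_schedule_with_warmup` in Transformers. Several inputs are assumptions rather than facts from this card: the effective batch size of 16 is hypothetical, the 20% validation split is assumed to be taken out of the 2,001 examples, and the decay is assumed to end at zero.

```python
import math


def lr_at_step(step: int, total_steps: int, peak_lr: float = 2e-4,
               warmup_ratio: float = 0.03) -> float:
    """Learning rate with linear warmup, then cosine decay to zero."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumption: the validation split is carved out of the 2,001 examples.
train_examples = 2001 - int(2001 * 0.20)
effective_batch_size = 16  # hypothetical; not stated in the card
steps_per_epoch = math.ceil(train_examples / effective_batch_size)
total_steps = 3 * steps_per_epoch  # 3 epochs
warmup_steps = max(1, int(total_steps * 0.03))

peak = lr_at_step(warmup_steps, total_steps)  # rate right after warmup
final = lr_at_step(total_steps, total_steps)  # rate at the last step
```

With these assumed numbers, training runs for 303 optimizer steps, of which the first 9 are warmup; a different batch size changes both counts but not the shape of the curve.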
License
Derived from Google Gemma 3; use is subject to the Gemma Terms of Use. Adapter weights are released for research use.
Citation
@misc{anonymous2026fromlengthy,
title={From Lengthy Narrative to Structured Data: Instruction Fine-Tuning Open-Weight LLMs for Information Extraction from Corporate Disclosures},
author={Anonymous},
year={2026},
note={Under review}
}