Instructions to use synterr-nlp/bea2026-gec-adapters with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use synterr-nlp/bea2026-gec-adapters with PEFT:
Task type is invalid.
- Notebooks
- Google Colab
- Kaggle
BEA 2026 β Russian GEC LoRA Adapters (SyntErr)
LoRA adapters from "What Aggregate Scores Hide: Per-Rule Evaluation of Russian Grammatical Error Correction" (BEA 2026). One repository, 29 adapters across 8 open base models Γ up to 4 training regimes, for Russian grammatical error correction.
- π Training data:
synterr-nlp/synterr-v4-sft - π§ Generator: github.com/synterr-nlp/synterr
- π Paper & artifacts: https://synterr-nlp.github.io/papers/bea-2026/
Training regimes
| folder pattern | regime | training data |
|---|---|---|
v4_<model> |
SyntErr-only | 39,209 synthetic examples (SyntErr v4) |
v4_<model>_lorugec |
SyntErr β LORuGEC | SyntErr-only, then continued on 348 real LORuGEC val examples |
v4_lorugec_only_<model> |
LORuGEC-only | 348 real LORuGEC val examples only |
v4_clean_only_<model> |
clean control | clean text (src = tgt) only |
Results β LORuGEC test, M2 Fβ.β
| Base model | folder stem | Zero-shot | +lorugec | +synterr | +sβlorugec |
|---|---|---|---|---|---|
| Qwen3.5-0.8B | qwen35_08b |
13.7 | 33.8 | 45.2 | 54.0 |
| Qwen3.5-4B | qwen35_4b |
47.6 | 54.7 | 67.0 | 75.3 |
| Qwen3.5-9B | qwen35_9b |
49.2 | 61.2 | 57.3 | 70.9 |
| Qwen2.5-7B | qwen25_7b |
39.5 | 44.0 | 58.1 | 66.6 |
| Gemma-3-1B | gemma3_1b |
18.9 | 25.5 | 48.0 | 47.8 |
| Gemma-3-4B | gemma3_4b |
41.5 | 43.0 | 54.5 | 58.2 |
| Gemma-3-12B | gemma3_12b |
52.5 | 58.8 | 66.0 | 69.7 |
| Apertus-8B | apertus_8b |
48.7 | 54.1 | 69.4 | 72.4 |
Columns map to folders: +synterr β v4_<stem>, +sβlorugec β
v4_<stem>_lorugec, +lorugec β v4_lorugec_only_<stem>. clean control
adapters (v4_clean_only_qwen25_7b, v4_clean_only_qwen35_08b) are reported in
the paper appendix. GigaChat-3.1-lightning adapters (v4_gc31_lightning*) are
included as supplementary runs.
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base_id = "Qwen/Qwen3.5-4B"
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
tok = AutoTokenizer.from_pretrained(base_id)
# load one adapter by subfolder
model = PeftModel.from_pretrained(
base, "synterr-nlp/bea2026-gec-adapters", subfolder="v4_qwen35_4b_lorugec"
)
Inference prompt is the system prompt described in the paper (Β§ Experimental setup); see the generator repo for the exact template.
Licenses
These are LoRA adapters (weight deltas), not standalone models β using them requires the corresponding base model under its own license:
| Base family | License |
|---|---|
| Qwen2.5 / Qwen3.5 | Apache-2.0 |
| Gemma 3 | Gemma Terms of Use (Google) |
| Apertus-8B | Apache-2.0 |
| GigaChat-3.1 | GigaChat License (Sber) |
The adapter weights in this repo are released for research use; you remain responsible for complying with each base model's terms.
Citation
@inproceedings{smirnova2026aggregate,
title = {What Aggregate Scores Hide: Per-Rule Evaluation of Russian Grammatical Error Correction},
author = {Smirnova, Anna and Kopan, Artyom and Makeev, Vladislav and Chernishev, George},
booktitle = {Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA)},
year = {2026},
url = {https://synterr-nlp.github.io/papers/bea-2026/},
}
- Downloads last month
- -
Model tree for synterr-nlp/bea2026-gec-adapters
Base model
Qwen/Qwen2.5-7B