GECToR ONNX Runtime — `gector-roberta-base-5k`

Lightweight ONNX export of the GECToR grammatical error correction model.

What's inside

File	Purpose
`model.onnx`	ONNX Runtime model (detection + tag classification heads)
`config.json`	Model configuration (label mappings, thresholds)
`tokenizer.json`	Fast tokenizer (no `transformers` dependency at runtime)
`verb-form-vocab.txt`	Verb conjugation lookup for `$TRANSFORM_VERB_*` tags
`inference.py`	Standalone inference script (see Quick usage below)
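
For context, a `$TRANSFORM_VERB_*` tag names a source and a target verb form (for example `VB_VBZ`), and the lookup file supplies the inflected word. The sketch below assumes `verb-form-vocab.txt` uses the line format of the original GECToR release, `word1_word2:TAG1_TAG2` (e.g. `go_goes:VB_VBZ`). The helper names `load_verb_decoder` and `transform_verb` are illustrative and are not taken from this repo.

```python
from typing import Dict, Iterable, Optional

TRANSFORM_PREFIX = "$TRANSFORM_VERB_"


def load_verb_decoder(lines: Iterable[str]) -> Dict[str, str]:
    """Build a {"word_TAG1_TAG2": inflected_word} lookup.

    Assumption: every line looks like "go_goes:VB_VBZ".
    """
    decoder: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        words, tags = line.split(":", 1)
        source, target = words.split("_", 1)
        # Keep the first entry if the same key appears twice.
        decoder.setdefault(f"{source}_{tags}", target)
    return decoder


def transform_verb(token: str, tag: str, decoder: Dict[str, str]) -> Optional[str]:
    """Return the inflected form for a $TRANSFORM_VERB_* tag, or None if unknown."""
    if not tag.startswith(TRANSFORM_PREFIX):
        return None
    forms = tag[len(TRANSFORM_PREFIX):]  # e.g. "VB_VBZ"
    return decoder.get(f"{token}_{forms}")


# In-memory sample standing in for the real file.
sample = ["go_goes:VB_VBZ", "go_went:VB_VBD", "has_have:VBZ_VB"]
decoder = load_verb_decoder(sample)
print(transform_verb("go", "$TRANSFORM_VERB_VB_VBZ", decoder))  # goes
```

To use the shipped file instead of the sample, pass an open file handle: `load_verb_decoder(open("verb-form-vocab.txt", encoding="utf-8"))`.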

Runtime dependencies (~80–150 MB total)

pip install onnxruntime tokenizers numpy

Note: torch and transformers are not required at runtime. They are only needed to export the model (see scripts/export_onnx.py in the source repo).
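
A quick way to confirm the environment matches this split is to probe for each package without importing it. This is a minimal sketch using only the standard library; the `check_installed` helper is illustrative.

```python
import importlib.util
from typing import Dict, Iterable

RUNTIME_DEPS = ("onnxruntime", "tokenizers", "numpy")
EXPORT_ONLY_DEPS = ("torch", "transformers")


def check_installed(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level package name to whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


runtime = check_installed(RUNTIME_DEPS)
missing = [name for name, ok in runtime.items() if not ok]
if missing:
    print("Missing runtime packages:", ", ".join(missing))
else:
    print("All runtime packages are importable.")

# These are only needed for export, so False here is fine at inference time.
print("Export-only packages present:", check_installed(EXPORT_ONLY_DEPS))
```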

Quick usage

1. Python API

from gector.onnx_predict import GECToRONNXPredictor

predictor = GECToRONNXPredictor(
    model_dir="letheviet/gector-roberta-base-5k-onnx",  # or local path
    verb_file="verb-form-vocab.txt",
    keep_confidence=0.0,
    min_error_prob=0.0,
    batch_size=128,
    n_iteration=5,
)

corrected = predictor.predict(["I has a apple .", "She go to school ."])
print(corrected)
# ['I have an apple .', 'She goes to school .']
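
GECToR is a sequence tagger: each pass predicts one edit tag per token, applies the edits, and feeds the result back in, which is the loop that `n_iteration` bounds. The sketch below illustrates that loop with a simplified tag set (`$KEEP`, `$DELETE`, `$REPLACE_x`, `$APPEND_x`). `toy_predict_tags` is a hypothetical stand-in for the ONNX model call, and the real predictor also handles `$TRANSFORM_*` tags and other details that are omitted here.

```python
from typing import Callable, List


def apply_tags(tokens: List[str], tags: List[str]) -> List[str]:
    """Apply one edit tag per token and return the edited token list."""
    out: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "$DELETE":
            continue
        if tag.startswith("$REPLACE_"):
            out.append(tag[len("$REPLACE_"):])
        elif tag.startswith("$APPEND_"):
            out.extend([token, tag[len("$APPEND_"):]])
        else:
            # "$KEEP", plus any tag type this sketch does not model.
            out.append(token)
    return out


def correct(
    tokens: List[str],
    predict_tags: Callable[[List[str]], List[str]],
    n_iteration: int = 5,
) -> List[str]:
    """Tag and edit repeatedly, stopping early once nothing changes."""
    for _ in range(n_iteration):
        tags = predict_tags(tokens)
        if all(tag == "$KEEP" for tag in tags):
            break
        tokens = apply_tags(tokens, tags)
    return tokens


def toy_predict_tags(tokens: List[str]) -> List[str]:
    """Hypothetical model stand-in: a fixed lookup table of fixes."""
    fixes = {"has": "$REPLACE_have", "a": "$REPLACE_an"}
    return [fixes.get(token, "$KEEP") for token in tokens]


print(" ".join(correct("I has a apple .".split(), toy_predict_tags)))
# I have an apple .
```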

2. Standalone script

Download inference.py from this repo and run:

# Single sentence
python inference.py \
    --model_dir letheviet/gector-roberta-base-5k-onnx \
    --input "I has a apple ."

# Batch from file
python inference.py \
    --model_dir letheviet/gector-roberta-base-5k-onnx \
    --input_file sentences.txt \
    --batch_size 64
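
The `--batch_size` flag presumably groups input lines before each model call. Below is a minimal sketch of that grouping, assuming `sentences.txt` holds one pre-tokenized sentence per line. Both the file layout and the `batched` helper are assumptions and are not taken from `inference.py`.

```python
from typing import Iterable, Iterator, List


def batched(sentences: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of up to batch_size non-empty, stripped sentences."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue  # ignore blank lines
        batch.append(sentence)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter batch


# In-memory lines standing in for an open file handle.
lines = ["I has a apple .\n", "\n", "She go to school .\n", "He like cats .\n"]
for batch in batched(lines, batch_size=2):
    print(batch)
```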

3. PyQt6 / desktop app integration

Because the runtime has zero torch/transformers dependencies, bundling with PyInstaller is straightforward:

pip install onnxruntime tokenizers numpy
# PyInstaller spec only needs the 4 files above (~1.5 MB + deps)
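
Inside a PyInstaller bundle, data files are unpacked under the directory exposed as `sys._MEIPASS`, so the app needs to resolve the model folder differently when frozen. This is a minimal sketch; the `model` subfolder name and the `dev_dir` default are hypothetical and must match whatever destination the spec file actually uses.

```python
import sys
from pathlib import Path


def model_dir(dev_dir: str = "gector-roberta-base-5k-onnx") -> Path:
    """Return the folder that holds the model files.

    When running from a PyInstaller bundle, sys._MEIPASS points at the
    unpacked data root. Otherwise fall back to a local directory.
    """
    bundle_root = getattr(sys, "_MEIPASS", None)
    if bundle_root is not None:
        # "model" is an assumed destination folder from the spec's data entries.
        return Path(bundle_root) / "model"
    return Path(dev_dir)


print(model_dir() / "model.onnx")
```

The returned path can then be passed as `model_dir` to the predictor shown in the Python API section.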

Performance (CPU)

Backend	Latency / sentence	Throughput
PyTorch (1 thread)	~282 ms	~3.5 sentences/sec
ONNX Runtime	~101 ms	~9.9 sentences/sec

Benchmarked on the original gotutiyan/gector-roberta-base-5k checkpoint with 20 iterations.
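
The throughput column follows directly from the latency column if sentences are processed one at a time (an assumption about how the benchmark was run). The arithmetic, using only the figures in the table:

```python
latency_ms = {"PyTorch (1 thread)": 282.0, "ONNX Runtime": 101.0}


def throughput(latency_ms_per_sentence: float) -> float:
    """Sentences per second for a given per-sentence latency in milliseconds."""
    return 1000.0 / latency_ms_per_sentence


for backend, ms in latency_ms.items():
    print(f"{backend}: {throughput(ms):.1f} sentences/sec")
# PyTorch (1 thread): 3.5 sentences/sec
# ONNX Runtime: 9.9 sentences/sec

speedup = latency_ms["PyTorch (1 thread)"] / latency_ms["ONNX Runtime"]
print(f"Speedup: {speedup:.1f}x")
# Speedup: 2.8x
```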

Source

Original PyTorch implementation: gotutiyan/gector-roberta-base-5k
ONNX export scripts & benchmark: gector repo
