T5 Tiny GEC — Grammar Error Correction

Tiny grammar correction model exported to ONNX for Transformers.js v3. Based on visheratin/t5-efficient-tiny-grammar-correction.

4 decoder layers, d_model=256, num_heads=4
Encoder: 43 MB FP32 / 11 MB INT8
Decoder: 79 MB FP32 / 20 MB INT8
Runs in-browser or Node.js via ONNX Runtime

Usage

import { pipeline } from '@huggingface/transformers';

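// In Transformers.js v3, `dtype` selects the weight precision
// (the v2-era `quantized` flag is not needed).
const corrector = await pipeline('text2text-generation', 'rabden/t5-tiny-gec-hone', {
  dtype: 'q8', // load the INT8-quantized ONNX files
});

// Greedy decoding (do_sample: false) gives deterministic output,
// so no temperature setting is required.
const result = await corrector('he go to school yesterday', {
  max_new_tokens: 64,
  do_sample: false,
});
console.log(result[0].generated_text);
// "He went to school yesterday."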
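The pipeline resolves to an array of result objects rather than a bare string. The following is a minimal sketch of a helper that pulls the corrected sentences out, assuming the usual Transformers.js text2text-generation output shape of `[{ generated_text: '...' }]`. The name `extractCorrections` is hypothetical and not part of the library.

```javascript
// Hypothetical helper (not part of Transformers.js): normalizes pipeline
// output into a plain array of corrected strings.
function extractCorrections(output) {
  // Accept a single object, a flat array, or a nested array of results.
  const items = [output].flat(Infinity);
  return items.map((item) => {
    if (!item || typeof item.generated_text !== 'string') {
      throw new TypeError('Expected an object with a string generated_text field');
    }
    // Trim stray whitespace that decoding may leave around the sentence.
    return item.generated_text.trim();
  });
}
```

With the `corrector` from the example above, this could be used as `const [fixed] = extractCorrections(await corrector(text, { max_new_tokens: 64 }));`.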

Performance

Measured on CPU (Node.js, ONNX Runtime, INT8 quantized):

Input	Output	Time
he go to school yesterday	He went to school yesterday.	~115ms
she dont like apples	She doesn't like apples.	~47ms
i have went to the store	I have been to the store.	~40ms
they was running fast	They were running fast.	~30ms

First load from the Hugging Face Hub: ~18 s. Cached load: ~0.7 s.
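The card does not say how these timings were collected. One way to gather comparable numbers is sketched below, assuming a Node.js version that exposes the global `performance` object. The helper names `measureLatency` and `median`, and the choice to discard one warm-up run, are assumptions for illustration rather than the author's method.

```javascript
// Hypothetical timing helper: runs an async function several times and
// collects wall-clock latencies in milliseconds.
async function measureLatency(run, { warmup = 1, repeats = 5 } = {}) {
  // Warm-up runs are executed but not recorded.
  for (let i = 0; i < warmup; i++) await run();
  const samples = [];
  for (let i = 0; i < repeats; i++) {
    const start = performance.now();
    await run();
    samples.push(performance.now() - start);
  }
  return samples;
}

// The median is less sensitive to a single slow run than the mean.
function median(values) {
  if (values.length === 0) throw new RangeError('median of empty array');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

For example, `median(await measureLatency(() => corrector('they was running fast', { max_new_tokens: 64 })))` would report a typical per-sentence latency on the local machine.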

Files

File	Size
`onnx/encoder_model.onnx`	43.5 MB
`onnx/encoder_model_quantized.onnx`	11.0 MB
`onnx/decoder_model_merged.onnx`	78.9 MB
`onnx/decoder_model_merged_quantized.onnx`	19.9 MB
`config.json`	—
`tokenizer.json`	2.3 MB

Notes

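The ONNX export uses a custom NonGrowingCache for cross-attention to prevent the KV cache from doubling across decoder steps. Cross-attention keys and values depend only on the encoder output, so they should be stored once and reused. The standard DynamicCache instead concatenates the past and new encoder keys, which causes a shape mismatch in the cross-attention position bias on the second decoder step.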
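The difference between the two caching behaviours can be illustrated with a toy model that tracks only sequence lengths. This is a conceptual sketch in JavaScript, not the export code: the actual NonGrowingCache is not shown in this card, and the class names `GrowingCache` and `WriteOnceCache` below are hypothetical stand-ins.

```javascript
// Toy caches that store plain arrays standing in for key tensors.
// GrowingCache mimics append-on-update behaviour; WriteOnceCache keeps
// the first value it receives and ignores later updates.
class GrowingCache {
  constructor() { this.keys = []; }
  update(newKeys) {
    this.keys = this.keys.concat(newKeys); // length grows on every call
    return this.keys;
  }
}

class WriteOnceCache {
  constructor() { this.keys = null; }
  update(newKeys) {
    if (this.keys === null) this.keys = newKeys.slice(); // store once
    return this.keys;
  }
}

// Feed the same encoder keys to the cache at every decoder step and
// record the key length that cross-attention would see.
function crossAttentionKeyLengths(cache, encoderLength, steps) {
  const encoderKeys = Array.from({ length: encoderLength }, (_, i) => i);
  const lengths = [];
  for (let step = 0; step < steps; step++) {
    lengths.push(cache.update(encoderKeys).length);
  }
  return lengths;
}

console.log(crossAttentionKeyLengths(new GrowingCache(), 6, 3));   // [ 6, 12, 18 ]
console.log(crossAttentionKeyLengths(new WriteOnceCache(), 6, 3)); // [ 6, 6, 6 ]
```

With the growing variant, the key length is already twice the encoder length on the second step, which is the kind of mismatch the note describes. The write-once variant keeps it fixed.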
