T5 Tiny GEC: Grammar Error Correction

A tiny grammar error correction (GEC) model, exported to ONNX for use with Transformers.js v3. It is based on visheratin/t5-efficient-tiny-grammar-correction.

  • 4 decoder layers, d_model=256, num_heads=4
  • Encoder: 43 MB FP32 / 11 MB INT8
  • Decoder: 79 MB FP32 / 20 MB INT8
  • Runs in-browser or Node.js via ONNX Runtime

Usage

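import { pipeline } from '@huggingface/transformers';

// Load the INT8-quantized ONNX weights. In Transformers.js v3, `dtype`
// selects the precision; the v2-era `quantized` flag is no longer needed.
const corrector = await pipeline('text2text-generation', 'rabden/t5-tiny-gec-hone', {
  dtype: 'q8',
});

// Greedy decoding: with sampling disabled, `temperature` has no effect.
const result = await corrector('he go to school yesterday', {
  max_new_tokens: 64,
  do_sample: false,
});

// The pipeline returns an array of { generated_text } objects.
console.log(result[0].generated_text);
// "He went to school yesterday."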
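To correct several sentences in one go, a small wrapper around the pipeline call is enough. The following is a minimal sketch, not part of Transformers.js: `correctAll` is a hypothetical helper name, and it takes the `corrector` function returned by `pipeline(...)` as a parameter so that it does not depend on how the model was loaded.

```javascript
// Hypothetical helper (not a Transformers.js API): run the corrector over
// several sentences and collect the corrected strings in input order.
async function correctAll(corrector, sentences, options = {}) {
  const corrected = [];
  for (const sentence of sentences) {
    // A text2text-generation pipeline resolves to [{ generated_text: '...' }].
    const output = await corrector(sentence, { max_new_tokens: 64, ...options });
    corrected.push(output[0].generated_text);
  }
  return corrected;
}
```

With a loaded `corrector`, `await correctAll(corrector, ['she dont like apples', 'they was running fast'])` resolves to an array with one corrected string per input sentence. Sentences are processed one at a time to keep memory use predictable on small devices.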

Performance

Measured on CPU (Node.js, ONNX Runtime, INT8 quantized):

| Input | Output | Time |
| --- | --- | --- |
| he go to school yesterday | He went to school yesterday. | ~115 ms |
| she dont like apples | She doesn't like apples. | ~47 ms |
| i have went to the store | I have been to the store. | ~40 ms |
| they was running fast | They were running fast. | ~30 ms |

First load from the Hugging Face Hub: ~18 s. Cached load: ~0.7 s.
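The measurement procedure behind these numbers is not described here. One way to collect comparable per-sentence latencies is sketched below. `timeCorrection` is a hypothetical helper, and the warm-up and run counts are arbitrary assumptions, not the settings used for the table.

```javascript
// Hypothetical timing helper: `corrector` is the function returned by pipeline().
async function timeCorrection(corrector, text, { warmup = 1, runs = 5 } = {}) {
  // Warm-up calls are excluded, because the first call may include one-time setup cost.
  for (let i = 0; i < warmup; i++) await corrector(text);

  const timings = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await corrector(text);
    timings.push(performance.now() - start);
  }

  timings.sort((a, b) => a - b);
  return {
    runs,
    minMs: timings[0],
    // Middle element of the sorted timings (the upper median for even run counts).
    medianMs: timings[Math.floor(timings.length / 2)],
    maxMs: timings[timings.length - 1],
  };
}
```

Reporting the median rather than a single run makes the result less sensitive to scheduling noise on a shared CPU.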

Files

| File | Size |
| --- | --- |
| onnx/encoder_model.onnx | 43.5 MB |
| onnx/encoder_model_quantized.onnx | 11.0 MB |
| onnx/decoder_model_merged.onnx | 78.9 MB |
| onnx/decoder_model_merged_quantized.onnx | 19.9 MB |
| config.json | - |
| tokenizer.json | 2.3 MB |
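The sizes above determine how much model data is fetched for each precision. The sketch below adds them up, assuming that `dtype: 'fp32'` resolves to the unsuffixed ONNX files and `dtype: 'q8'` to the `_quantized` ones, which matches the file naming in the table. `ONNX_FILES` and `downloadSizeMB` are illustrative names, and tokenizer.json and config.json are left out of the totals.

```javascript
// Sizes in MB, copied from the file table. The dtype-to-file grouping is an
// assumption based on the file names.
const ONNX_FILES = {
  fp32: {
    'onnx/encoder_model.onnx': 43.5,
    'onnx/decoder_model_merged.onnx': 78.9,
  },
  q8: {
    'onnx/encoder_model_quantized.onnx': 11.0,
    'onnx/decoder_model_merged_quantized.onnx': 19.9,
  },
};

// Sum the ONNX file sizes for one precision.
function downloadSizeMB(dtype) {
  const files = ONNX_FILES[dtype];
  if (!files) throw new Error(`No files listed for dtype "${dtype}"`);
  return Object.values(files).reduce((sum, mb) => sum + mb, 0);
}

const fp32 = downloadSizeMB('fp32');
const q8 = downloadSizeMB('q8');
console.log(`fp32: ${fp32.toFixed(1)} MB, q8: ${q8.toFixed(1)} MB, ratio: ${(fp32 / q8).toFixed(2)}x`);
// fp32: 122.4 MB, q8: 30.9 MB, ratio: 3.96x
```

The roughly 4x reduction is what one would expect when 32-bit float weights are stored as 8-bit integers.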

Notes

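The ONNX export uses a custom NonGrowingCache for cross-attention to prevent the KV cache from doubling across decoder steps. Cross-attention keys and values are computed from the encoder output and have the same length at every decoder step. The standard DynamicCache nevertheless concatenates the cached encoder keys with the newly supplied ones, so on the second decoder step the key length no longer matches the cross-attention position bias, which causes a shape mismatch.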
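The difference between the two cache behaviours can be illustrated with a toy model in which a "tensor" is just an array with one entry per encoder position. This is a simplified sketch of the idea only, not the export code: `ToyGrowingCache`, `ToyNonGrowingCache`, and `crossAttentionKeyLengths` are hypothetical names introduced for illustration.

```javascript
// Toy stand-in for a concatenating cache: every update appends the new keys.
class ToyGrowingCache {
  constructor() { this.keys = []; }
  update(newKeys) {
    this.keys = this.keys.concat(newKeys);
    return this.keys;
  }
}

// Toy stand-in for a non-growing cache: store the encoder keys once, then
// return the stored copy unchanged on every later update.
class ToyNonGrowingCache {
  constructor() { this.keys = null; }
  update(newKeys) {
    if (this.keys === null) this.keys = newKeys.slice();
    return this.keys;
  }
}

// Feed the same encoder keys to the cache at each decoder step and record
// the key length that cross-attention would see.
function crossAttentionKeyLengths(cache, encoderKeys, steps) {
  const lengths = [];
  for (let step = 0; step < steps; step++) {
    lengths.push(cache.update(encoderKeys).length);
  }
  return lengths;
}

const encoderKeys = ['k0', 'k1', 'k2', 'k3', 'k4']; // 5 encoder positions
console.log(crossAttentionKeyLengths(new ToyGrowingCache(), encoderKeys, 3));    // [ 5, 10, 15 ]
console.log(crossAttentionKeyLengths(new ToyNonGrowingCache(), encoderKeys, 3)); // [ 5, 5, 5 ]
```

In the growing case, the key length on the second step is twice the encoder length, so a position bias sized for the encoder length can no longer be added to the attention scores. In the non-growing case, the length stays constant.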
