T5 Tiny GEC: Grammar Error Correction
Tiny grammar correction model exported to ONNX for Transformers.js v3. Based on visheratin/t5-efficient-tiny-grammar-correction.
- 4 decoder layers, d_model=256, num_heads=4
- Encoder: 43MB FP32 / 11MB INT8
- Decoder: 79MB FP32 / 20MB INT8
- Runs in-browser or Node.js via ONNX Runtime
Usage
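import { pipeline } from '@huggingface/transformers';

// Transformers.js v3 selects the INT8 weights through `dtype`;
// the v2-style `quantized` flag is not needed here.
const corrector = await pipeline('text2text-generation', 'rabden/t5-tiny-gec-hone', {
  dtype: 'q8',
});

// Greedy decoding, so the same input yields the same correction.
const result = await corrector('he go to school yesterday', {
  max_new_tokens: 64,
  temperature: 0,
  do_sample: false,
});

// The pipeline returns an array of objects with a `generated_text` field.
console.log(result[0].generated_text);
// "He went to school yesterday."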
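When correcting more than one sentence, it can help to wrap the call in small helpers. The sketch below is illustrative only: `extractCorrection` and `correctAll` are hypothetical names, not part of Transformers.js, and the code assumes the usual `[{ generated_text }]` output shape of a `text2text-generation` pipeline.

```javascript
// Pull the corrected string out of a text2text-generation result.
// Assumption: the output looks like [{ generated_text: '...' }].
function extractCorrection(output) {
  const first = Array.isArray(output) ? output[0] : output;
  if (!first || typeof first.generated_text !== 'string') {
    throw new TypeError('Unexpected pipeline output shape');
  }
  return first.generated_text.trim();
}

// Correct sentences one at a time, in order.
// `corrector` is any async function with the pipeline's call signature,
// for example the `corrector` created in the usage snippet.
async function correctAll(corrector, sentences, options = { max_new_tokens: 64, do_sample: false }) {
  const corrected = [];
  for (const sentence of sentences) {
    const output = await corrector(sentence, options);
    corrected.push(extractCorrection(output));
  }
  return corrected;
}
```

With a loaded pipeline this would be called as `await correctAll(corrector, ['she dont like apples', 'they was running fast'])`. Processing sequentially keeps memory use predictable on small devices; whether batching is faster for this model has not been measured here.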
Performance
Measured on CPU (Node.js, ONNX Runtime, INT8 quantized):
| Input | Output | Time |
|---|---|---|
| he go to school yesterday | He went to school yesterday. | ~115ms |
| she dont like apples | She doesn't like apples. | ~47ms |
| i have went to the store | I have been to the store. | ~40ms |
| they was running fast | They were running fast. | ~30ms |
First-load from HuggingFace Hub: ~18s / Cached load: ~0.7s
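The table does not state how the timings were taken. One way to collect comparable numbers is sketched below; `median` and `benchmark` are hypothetical helpers, and the untimed warm-up run is an assumption made here so that one-off initialization cost is kept out of the measurements.

```javascript
// Median of a non-empty list of numbers.
function median(values) {
  if (values.length === 0) {
    throw new RangeError('median needs at least one value');
  }
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Time an async function over several runs.
// `warmup` untimed calls run first; then `runs` timed calls are recorded.
async function benchmark(fn, { runs = 5, warmup = 1 } = {}) {
  if (runs < 1) {
    throw new RangeError('runs must be at least 1');
  }
  for (let i = 0; i < warmup; i++) {
    await fn();
  }
  const timesMs = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now(); // global in browsers and recent Node.js
    await fn();
    timesMs.push(performance.now() - start);
  }
  return { timesMs, medianMs: median(timesMs) };
}
```

For example, `await benchmark(() => corrector('they was running fast', { max_new_tokens: 64, do_sample: false }))` would report per-run latencies and their median for one input. Reporting the median rather than a single run makes the result less sensitive to scheduler noise.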
Files
| File | Size |
|---|---|
| onnx/encoder_model.onnx | 43.5 MB |
| onnx/encoder_model_quantized.onnx | 11.0 MB |
| onnx/decoder_model_merged.onnx | 78.9 MB |
| onnx/decoder_model_merged_quantized.onnx | 19.9 MB |
| config.json | - |
| tokenizer.json | 2.3 MB |
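To see what the quantized variant saves in download size, the sizes in the Files table can be summed. This is plain arithmetic on the listed numbers; `config.json` is left out because its size is not given, and the assumption that `dtype: 'q8'` fetches the `_quantized` files reflects the usual Transformers.js v3 naming rather than anything verified for this repository.

```javascript
// Sizes in MB, copied from the Files table.
const sizesMB = {
  'onnx/encoder_model.onnx': 43.5,
  'onnx/encoder_model_quantized.onnx': 11.0,
  'onnx/decoder_model_merged.onnx': 78.9,
  'onnx/decoder_model_merged_quantized.onnx': 19.9,
  'tokenizer.json': 2.3,
};

// Sum the sizes of the named files.
function totalMB(files) {
  return files.reduce((sum, name) => sum + sizesMB[name], 0);
}

const fp32 = totalMB(['onnx/encoder_model.onnx', 'onnx/decoder_model_merged.onnx']);
const int8 = totalMB(['onnx/encoder_model_quantized.onnx', 'onnx/decoder_model_merged_quantized.onnx']);
const int8WithTokenizer = int8 + sizesMB['tokenizer.json'];

console.log(`FP32 weights: ${fp32.toFixed(1)} MB`);                        // 122.4 MB
console.log(`INT8 weights: ${int8.toFixed(1)} MB`);                        // 30.9 MB
console.log(`Reduction: ${(fp32 / int8).toFixed(2)}x`);                    // 3.96x
console.log(`INT8 + tokenizer: ${int8WithTokenizer.toFixed(1)} MB`);       // 33.2 MB
```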
Notes
The ONNX export uses a custom NonGrowingCache for cross-attention to prevent KV cache doubling across decoder steps. Standard DynamicCache concatenates past and new encoder keys, causing a shape mismatch in cross-attention position bias on the second decoder step.
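The effect can be illustrated with a toy model that tracks only the cross-attention key length at each decoder step. This is not the export code or the actual NonGrowingCache: all function names are hypothetical, no tensors are involved, and it assumes the growing cache appends one full set of encoder keys per step.

```javascript
// Growing behaviour: each decoder step concatenates the encoder keys
// onto whatever is already cached along the sequence axis.
function growingCrossKeyLengths(encoderLen, steps) {
  const lengths = [];
  let cached = 0;
  for (let step = 0; step < steps; step++) {
    cached += encoderLen; // concat(past, new)
    lengths.push(cached);
  }
  return lengths;
}

// Non-growing behaviour: encoder keys are stored once and then reused,
// so the key length stays equal to the encoder length.
function nonGrowingCrossKeyLengths(encoderLen, steps) {
  const lengths = [];
  let cached = null;
  for (let step = 0; step < steps; step++) {
    if (cached === null) {
      cached = encoderLen; // first step fills the cache
    }
    lengths.push(cached);
  }
  return lengths;
}

// Index of the first step whose key length no longer matches the
// encoder length that the position bias was built for, or -1 if none.
function firstMismatchStep(lengths, encoderLen) {
  return lengths.findIndex((len) => len !== encoderLen);
}

const encoderLen = 7;
console.log(growingCrossKeyLengths(encoderLen, 3));    // [7, 14, 21]
console.log(nonGrowingCrossKeyLengths(encoderLen, 3)); // [7, 7, 7]
console.log(firstMismatchStep(growingCrossKeyLengths(encoderLen, 3), encoderLen)); // 1 (second step)
```

In this toy model the growing variant first diverges at step index 1, which matches the second-step failure described in the note.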