metadata
license: cc-by-nc-3.0
datasets:
- FredZhang7/toxi-text-3M
pipeline_tag: text-classification
I have decided to release all of the auto-moderation models at once, sometime in July. The curated datasets used to train these models will be available first.
Finished training: 6/30/2023
Final Train & Validation Accuracy: 95-98%
Large model (v2) will be available for PyTorch
Lightweight model and tokenizer (v1) will be available for transformers.js
Models tested: roberta, xlm-roberta, bert-tiny, bert-base-cased/uncased, bert-multilingual-cased/uncased, albert-large-v2
Decision: bert-multilingual-cased
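
Until the fine-tuned weights are published, the chosen architecture can be tried out with the public base checkpoint. The snippet below is a minimal sketch, not the released model: it assumes "bert-multilingual-cased" refers to the `bert-base-multilingual-cased` checkpoint on the Hugging Face Hub, and it assumes a two-label (safe / toxic) head. The helper names `build_classifier` and `score_texts` are hypothetical. The classification head created here is randomly initialized, so its scores are meaningless until the model is fine-tuned.

```python
# Assumption: the Hub ID of the chosen base model.
BASE_CHECKPOINT = "bert-base-multilingual-cased"


def build_classifier(num_labels: int = 2):
    """Load the base tokenizer and attach a fresh classification head.

    num_labels=2 is an assumption (safe vs. toxic), not a confirmed detail.
    """
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_CHECKPOINT)
    model = AutoModelForSequenceClassification.from_pretrained(
        BASE_CHECKPOINT, num_labels=num_labels
    )
    return tokenizer, model


def score_texts(tokenizer, model, texts):
    """Return per-class probabilities for a list of strings."""
    import torch

    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    model.eval()
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)
```

Calling `build_classifier()` downloads the base weights on first use; `score_texts(tokenizer, model, ["some text"])` then returns a tensor with one row per input and one column per label.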