bert-base-uncased fine-tuned on the AG News dataset using PyTorch Lightning. Training used a sequence length of 128, a learning rate of 2e-5, a batch size of 32, 4 T4 GPUs, and 4 epochs. The training code can be found here
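
The original training script is not reproduced here, but the setup above roughly corresponds to the following PyTorch Lightning sketch (class and function names are illustrative, not the actual code):

```python
import pytorch_lightning as pl
import torch
from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "bert-base-uncased"
MAX_LEN = 128      # sequence length
LR = 2e-5          # learning rate
BATCH_SIZE = 32    # per-device batch size
EPOCHS = 4

class AGNewsClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # AG News has 4 classes: World, Sports, Business, Sci/Tech
        self.model = AutoModelForSequenceClassification.from_pretrained(
            MODEL_NAME, num_labels=4
        )

    def training_step(self, batch, batch_idx):
        outputs = self.model(**batch)
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=LR)

def make_dataloader():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    dataset = load_dataset("ag_news", split="train")

    def tokenize(batch):
        return tokenizer(
            batch["text"], truncation=True, padding="max_length", max_length=MAX_LEN
        )

    dataset = dataset.map(tokenize, batched=True)
    # The model expects the target column to be named "labels"
    dataset = dataset.rename_column("label", "labels")
    dataset.set_format(type="torch", columns=["input_ids", "attention_mask", "labels"])
    return DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)

if __name__ == "__main__":
    trainer = pl.Trainer(max_epochs=EPOCHS, accelerator="gpu", devices=4)
    trainer.fit(AGNewsClassifier(), make_dataloader())
```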
Limitations and bias
- Not the best-performing model available for this task; treat it as a straightforward fine-tuning baseline.
Data came from HuggingFace's `datasets` package. The data can be viewed on the nlp viewer.
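
For a quick local look at the data, the dataset can be loaded and inspected with the `datasets` package:

```python
from datasets import load_dataset

# Load AG News and inspect the splits and a sample record
ag_news = load_dataset("ag_news")
print(ag_news)               # split names and sizes
print(ag_news["train"][0])   # {'text': ..., 'label': ...}
```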