SLM 500M Common Corpus English
This is a custom GPT-style PyTorch language model.
Dataset: PleIAs/common_corpus
Filter: language == English
Tokenizer: tiktoken GPT-2 encoding
Parameters: approximately 505.2M
Files:
- config.json
- pytorch_model.bin
- training_state.pt
This is not a Transformers AutoModel checkpoint, so it cannot be loaded through the AutoModel classes. Instantiate the custom GPT and GPTConfig classes from the notebook and load the weights into them.
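The loading step might look roughly like the sketch below. It is only an illustration under several assumptions that this card does not confirm: that the keys in config.json match the GPTConfig constructor arguments, that GPT takes a single config object, and that pytorch_model.bin holds a plain state_dict. The helper names read_config and load_model are hypothetical, and GPT and GPTConfig must come from the notebook.

```python
import json
from pathlib import Path


def read_config(model_dir):
    """Return the contents of config.json as a plain dict."""
    with open(Path(model_dir) / "config.json", encoding="utf-8") as f:
        return json.load(f)


def load_model(model_dir, GPT, GPTConfig, device="cpu"):
    """Build the custom model and load pytorch_model.bin into it.

    Assumptions (not confirmed by this card):
      - config.json keys match the GPTConfig constructor arguments
      - GPT accepts a single GPTConfig instance
      - pytorch_model.bin holds a plain state_dict
    """
    import torch  # imported here so read_config works without PyTorch

    config = GPTConfig(**read_config(model_dir))
    model = GPT(config)
    state_dict = torch.load(
        Path(model_dir) / "pytorch_model.bin",
        map_location=device,
        weights_only=True,  # only tensors are expected in a state_dict
    )
    model.load_state_dict(state_dict)
    return model.to(device).eval()
```

Once GPT and GPTConfig are defined by running the notebook cells, a call such as `model = load_model("path/to/downloaded/files", GPT, GPTConfig)` would return the model in eval mode. Prompts would then be tokenized with `tiktoken.get_encoding("gpt2")`, matching the tokenizer listed above. If load_state_dict reports missing or unexpected keys, the checkpoint layout differs from what this sketch assumes.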