---
license: cc-by-sa-4.0
datasets:
- Chrisneverdie/OnlySports_Dataset
language:
- en
pipeline_tag: text-generation
tags:
- Sports
---
# OnlySportsLM
## Model Overview
OnlySportsLM is a 196M language model specifically designed and trained for sports-related natural language processing tasks. It is part of the larger OnlySports collection, which aims to advance domain-specific language modeling in sports.
## Model Architecture
- Base architecture: RWKV-v6
- Parameters: 196 million
- Structure: 20 layers, hidden dimension 640
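The stated size can be sanity-checked with a rough back-of-envelope count. This sketch assumes the RWKV "world" tokenizer vocabulary of 65,536 and approximately 13·d² weight parameters per RWKV-v6 block (time-mix and channel-mix matrices); LoRA and LayerNorm terms are ignored, so the figure is an estimate, not the exact count.

```python
# Rough parameter estimate for a 20-layer, d=640 RWKV-v6 model.
# VOCAB and the per-block factor of 13 are assumptions for illustration.
VOCAB = 65_536   # assumed RWKV world tokenizer size
D = 640          # hidden dimension
LAYERS = 20

embed_and_head = 2 * VOCAB * D   # input embedding + output head
per_block = 13 * D * D           # approx. matmul weights per RWKV-v6 block
total = embed_and_head + LAYERS * per_block

print(f"~{total / 1e6:.0f}M parameters")
```

The estimate lands near the reported 196M, with the gap plausibly covered by the smaller per-layer terms omitted here.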
## Training
- Dataset: OnlySports Dataset (315B tokens used out of the 600B-token corpus)
- Hardware: 8× H100 GPUs
- Optimizer: AdamW
- Learning rate: 6e-4 initially, lowered to 1e-4 after observed loss spikes
- Context length: 1024 tokens
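The learning-rate adjustment above can be sketched as a simple rule: train at 6e-4 and fall back to 1e-4 once a loss spike appears. The spike criterion here (loss exceeding 1.5× its running mean) is an illustrative assumption, not the authors' exact procedure.

```python
# Minimal sketch of the loss-spike fallback described above.
# SPIKE_RATIO is an assumed detection threshold for illustration.
INITIAL_LR = 6e-4
FALLBACK_LR = 1e-4
SPIKE_RATIO = 1.5  # assumption: loss > 1.5x running mean counts as a spike

def next_lr(current_lr: float, loss: float, running_mean_loss: float) -> float:
    """Return the learning rate to use for the next step."""
    if running_mean_loss > 0 and loss > SPIKE_RATIO * running_mean_loss:
        return FALLBACK_LR  # permanently drop to the safer rate
    return current_lr
```

In practice this logic would sit inside the training loop, updating the AdamW optimizer's learning rate between steps.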
## Performance
OnlySportsLM performs strongly on sports-related tasks:
- Outperforms the previous SOTA 135M/360M models by 37.62%/34.08% on the OnlySports Benchmark
- Competitive with larger models such as SmolLM 1.7B and Qwen 1.5B in the sports domain
## Usage
You can use this model for sports-related text generation. Download all files in this repository and run RWKV_v6_demo.py for inference.
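As an alternative to the demo script, inference can be sketched with the community `rwkv` pip package, which supports RWKV-v6 checkpoints. The prompt format, file name, and sampling settings below are assumptions for illustration; the repo's RWKV_v6_demo.py remains the reference implementation.

```python
# Sketch of generation with the `rwkv` pip package (pip install rwkv).
# Model path and prompt format are hypothetical examples.
import os

def build_prompt(question: str) -> str:
    """Simple Q/A-style prompt; the exact format is an assumption."""
    return f"Q: {question}\nA:"

def generate(model_path: str, prompt: str, max_tokens: int = 128) -> str:
    os.environ["RWKV_JIT_ON"] = "1"   # enable the package's JIT kernels
    os.environ["RWKV_CUDA_ON"] = "0"  # set "1" to use the CUDA kernel
    from rwkv.model import RWKV
    from rwkv.utils import PIPELINE, PIPELINE_ARGS

    model = RWKV(model=model_path, strategy="cpu fp32")  # or "cuda fp16"
    pipeline = PIPELINE(model, "rwkv_vocab_v20230424")   # RWKV world tokenizer
    args = PIPELINE_ARGS(temperature=0.7, top_p=0.9)
    return pipeline.generate(prompt, token_count=max_tokens, args=args)

# Example (hypothetical checkpoint name):
# print(generate("OnlySportsLM-196M.pth", build_prompt("Who won the 2022 World Cup?")))
```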
## Limitations
- The model is specifically trained on sports-related content and may not perform as well on general topics
- Training was stopped at 315B tokens due to resource constraints, potentially limiting its full capabilities
## Related Resources
## Citation
If you use OnlySportsLM in your research, please cite our paper.
## Contact
For more information or inquiries about OnlySportsLM, please visit our GitHub repository.