Model Details
- Architecture: default GPT-2, decoder-only
- Number of parameters: ~204M
- Number of training tokens seen: ~1.3B
- Dataset: the USPTO subset of The Pile, interleaved with the PubMed Abstracts subset of The Pile.
- Interleaving probabilities: [0.5, 0.5] (the first value applies to USPTO, the second to PubMed Abstracts)
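
The interleaving with probabilities [0.5, 0.5] described above can be illustrated with a minimal sketch. This card does not say which tool produced the mixture, so everything here is an assumption for illustration: the function name `interleave`, the `seed` parameter, the placeholder documents, and the choice to stop as soon as the sampled source runs out. Libraries such as Hugging Face `datasets` offer a comparable `interleave_datasets` function with a `probabilities` argument, which may be closer to what was actually used.

```python
import random
from typing import Iterable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def interleave(
    sources: Sequence[Iterable[T]],
    probabilities: Sequence[float],
    seed: int = 0,
) -> Iterator[T]:
    """Yield items by picking one source at random for each step.

    Hypothetical helper: stops as soon as the picked source is exhausted
    (an assumed stopping rule, not one stated in the model card).
    """
    if len(sources) != len(probabilities):
        raise ValueError("need exactly one probability per source")

    rng = random.Random(seed)
    iterators = [iter(source) for source in sources]
    indices = range(len(iterators))

    while True:
        # Sample which source supplies the next example.
        i = rng.choices(indices, weights=probabilities, k=1)[0]
        try:
            yield next(iterators[i])
        except StopIteration:
            return


# Placeholder documents standing in for the two Pile subsets.
uspto = [f"uspto-{n}" for n in range(1000)]
pubmed = [f"pubmed-{n}" for n in range(1000)]

# First probability is for USPTO, second for PubMed Abstracts.
mixed = list(interleave([uspto, pubmed], [0.5, 0.5], seed=0))
uspto_share = sum(doc.startswith("uspto") for doc in mixed) / len(mixed)
print(len(mixed), round(uspto_share, 3))
```

With equal probabilities, each source contributes roughly half of the examples, and the order within each source is preserved.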