| Model Name | Parameters | Class | Ratio | Tokens | Batch Size (Tokens) | Training Loss |
| --- | --- | --- | --- | --- | --- | --- |
| GerbilLab/Gerbil-B-3.3m | 3.3m | B-Class | 42 | 126M | 65.5k | 6.0822 |