---
library_name: transformers
tags: []
---
# Mula 4x160m
| Step | Evaluation Loss | Perplexity |
|--------|-----------------|------------|
| 50000 | 3.03 | 20.73 |
| 100000 | 2.84 | 17.14 |
| 150000 | 2.73 | 15.35 |
| 200000 | 2.64 | 14.05 |
| 250000 | 2.56 | 12.95 |
| 300000 | 2.49 | 12.14 |
| 350000 | 2.46 | 11.75 |
| 400000 | 2.46 | 11.72 |
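
The perplexity column is consistent with the exponential of the evaluation loss. The sketch below reproduces that relationship, assuming the loss is a mean token-level cross-entropy in nats (the README does not state this). Small differences from the table are expected because the listed losses are rounded to two decimals. The `perplexity` helper is illustrative, not part of any library.

```python
import math


def perplexity(loss: float) -> float:
    """Return exp(loss), assuming a mean cross-entropy loss in nats."""
    return math.exp(loss)


# (step, evaluation loss) pairs copied from the table above
eval_losses = [(50000, 3.03), (200000, 2.64), (400000, 2.46)]

for step, loss in eval_losses:
    print(f"step {step}: perplexity ~ {perplexity(loss):.2f}")
```
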

| **Model** | **ARC** | **HellaSwag** | **MMLU** | **TruthfulQA** | **Average** |
|------------------|-----------|---------------|-----------|----------------|-------------|
| **TTL-460m** | 29.40 | 33.00 | 28.55 | 41.10 | 33.01 |
| **TTL-160m** | 26.15 | 29.29 | 28.11 | 41.12 | 31.16 |
| **Mula-4x160m** | 27.09 | 31.41 | 28.15 | 39.81 | 31.61 |

| **Model** | **ASSIN2 RTE** | **ASSIN2 STS** | **BLUEX** | **ENEM** | **FAQUAD NLI** | **HateBR** | **OAB Exams** | **Average** |
|-------------------|----------------|----------------|-----------|----------|----------------|------------|---------------|-------------|
| **TTL-460m** | 53.93 | 12.66 | 22.81 | 19.87 | 49.01 | 33.59 | 27.06 | 31.27 |
| **TTL-160m** | 53.36 | 2.58 | 21.84 | 18.75 | 43.97 | 36.88 | 22.60 | 28.56 |
| **Mula-4x160m** | 33.55 | 8.88 | 20.58 | 20.08 | 43.97 | 33.65 | 22.92 | 26.23 |
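
The Average column appears to be the unweighted mean of the per-task scores. This is inferred from the numbers, not stated in the README. A minimal sketch that recomputes it for the Mula-4x160m row, where `average_score` is a hypothetical helper:

```python
from statistics import mean

# Mula-4x160m scores copied from the table above, in column order
mula_scores = {
    "ASSIN2 RTE": 33.55,
    "ASSIN2 STS": 8.88,
    "BLUEX": 20.58,
    "ENEM": 20.08,
    "FAQUAD NLI": 43.97,
    "HateBR": 33.65,
    "OAB Exams": 22.92,
}


def average_score(scores: dict[str, float]) -> float:
    """Unweighted mean over tasks (assumed aggregation, see note above)."""
    return mean(scores.values())


print(f"Mula-4x160m average: {average_score(mula_scores):.2f}")
```
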