sarath-shekkizhar committed on
Commit
1edad18
1 Parent(s): c3c7ee0

Update README.md

Fixing broken link

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -128,7 +128,7 @@ These benchmarks test reasoning and knowledge in various tasks in few-shot setti
  | Mistral-7B | 62.4 | 74.0 | 38.1 | 57.2 | 62.8 | 37.8 | 55.38 |
  | OpenLLM Leader-7B | 64.3 | 78.7 | 73.3 | 66.6 | 68.4 | 58.5 | 68.3 |
 
- **Note:** While the Open LLM Leaderboard indicates that these chat models perform less effectively compared to the leading 7B model, it's important to note that the leading model struggles in the multi-turn chat setting of MT-Bench (as demonstrated in our evaluation [above](https://www.notion.so/TenyxChat-Language-Model-Alignment-using-Tenyx-Fine-tuning-30e60a53d17a46b0a4755c74f0f8b222?pvs=21)). In contrast, TenyxChat-7B-v1 demonstrates robustness against common fine-tuning challenges, such as *catastrophic forgetting*. This unique feature enables TenyxChat-7B-v1 to excel not only in chat benchmarks like MT-Bench, but also in a wider range of general reasoning benchmarks on the Open LLM Leaderboard.
+ **Note:** While the Open LLM Leaderboard indicates that these chat models perform less effectively compared to the leading 7B model, it's important to note that the leading model struggles in the multi-turn chat setting of MT-Bench (as demonstrated in our evaluation [above](#comparison-with-additional-open-llm-leaderboard-models)). In contrast, TenyxChat-7B-v1 demonstrates robustness against common fine-tuning challenges, such as *catastrophic forgetting*. This unique feature enables TenyxChat-7B-v1 to excel not only in chat benchmarks like MT-Bench, but also in a wider range of general reasoning benchmarks on the Open LLM Leaderboard.
 
  # Limitations
 
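
The fix replaces an external Notion URL with an in-page anchor, which only resolves if the anchor matches the slug the renderer generates for the target heading. As a rough check, the sketch below approximates a GitHub-style slug rule (lowercase, drop punctuation, spaces to hyphens). Both the rule and the heading text are assumptions: the heading is inferred from the anchor in the `+` line, and `heading_to_anchor` is a hypothetical helper, not part of any library.

```python
import re


def heading_to_anchor(heading: str) -> str:
    """Approximate a GitHub-style heading slug.

    Assumed rule, not confirmed for the renderer used by this README.
    """
    slug = heading.strip().lower()
    # Drop everything except word characters, spaces, and hyphens.
    slug = re.sub(r"[^\w\- ]", "", slug)
    # Each space becomes a hyphen.
    return "#" + slug.replace(" ", "-")


# Hypothetical heading text, inferred from the anchor in the "+" line.
print(heading_to_anchor("Comparison with additional Open LLM LeaderBoard models"))
```

If the printed anchor differs from the one in the link, the link would likely still be broken after the change, so running a check like this against the actual heading text is a cheap way to review link fixes.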