Commit 81a87ac by patricia-rocha
1 parent: 6e46ae4

Update README.md

Files changed (1): README.md (+2 -1)
README.md CHANGED
@@ -113,13 +113,14 @@ We then conducted their [automatic evaluation](https://github.com/FreedomIntelli
 This prompt was designed to elicit assessments of answers in terms of helpfulness, relevance, accuracy, and level of detail.
 [Additional prompts](https://github.com/FreedomIntelligence/LLMZoo/blob/main/llmzoo/eval/prompts/order/prompt_all.json) are provided for assessing overall performance on different perspectives.
 
-Follows the results against GPT-3.5 and Falcon, one of the highest performing open-source models at the moment:
+Follows the results against GPT-3.5, two of the highest performing open-source models at the moment, Vicuna (13B) and Falcon (7B):
 
 * Automatic Evaluation **in Portuguese**:
 
 | | **Lose** | **Tie** | **Win** |
 |------------------------|----------|---------|---------|
 | QUOKKA vs. **GPT-3.5** | 63.8% | 10.1% | 26.1% |
+| QUOKKA vs. **Vicuna** | 66.2% | 8.8% | 25.0% |
 | QUOKKA vs. **Falcon** | 17.4% | 1.4% | 81.2% |
 
 ## Environmental impact