Update README.md
README.md
CHANGED
@@ -6,6 +6,40 @@ base_model:
 - Qwen/QwQ-32B-Preview
 new_version: Qwen/QwQ-32B-Preview
 ---
+## Evaluation Results
+
+### Evaluation Metrics
+
+| **Groups** | **Version** | **Filter** | **n-shot** | **Metric** | **Direction** | **Value** | **Stderr** |
+|----------------------|:-----------:|:----------:|:----------:|:----------:|:-------------:|----------:|-----------:|
+| **mmlu** | 2 | none | - | acc | ↑ | 0.8034 | ±0.0032 |
+| **humanities** | 2 | none | - | acc | ↑ | 0.7275 | ±0.0062 |
+| **other** | 2 | none | - | acc | ↑ | 0.8323 | ±0.0064 |
+| **social sciences**| 2 | none | - | acc | ↑ | 0.8856 | ±0.0056 |
+| **stem** | 2 | none | - | acc | ↑ | 0.8081 | ±0.0068 |
+
+### Description
+
+- **mmlu**: Overall accuracy across multiple domains.
+- **humanities**: Accuracy in humanities-related tasks.
+- **other**: Accuracy in other unspecified domains.
+- **social sciences**: Accuracy in social sciences-related tasks.
+- **stem**: Accuracy in STEM (Science, Technology, Engineering, Mathematics) related tasks.
+
+### Visualization
+
+If supported, the following Mermaid diagram visualizes the accuracy metrics across different groups:
+
+```mermaid
+bar
+title Accuracy Metrics by Group
+x-axis Groups
+y-axis Accuracy
+"mmlu" : 0.8034
+"humanities" : 0.7275
+"other" : 0.8323
+"social sciences" : 0.8856
+"stem" : 0.8081
 
 
 # QwQ-32B-Preview-quantized-autoround-GPTQ-sym-4bit
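The Value and Stderr columns added in this change can be read together as rough confidence intervals. The sketch below is not part of the commit. It assumes that Stderr is the standard error of the accuracy estimate and that a normal approximation (z = 1.96 for a two-sided 95% interval) is adequate. The names `RESULTS`, `confidence_interval`, and `format_report` are hypothetical and exist only for illustration.

```python
# Values and standard errors copied from the Evaluation Metrics table in the diff.
RESULTS = {
    "mmlu": (0.8034, 0.0032),
    "humanities": (0.7275, 0.0062),
    "other": (0.8323, 0.0064),
    "social sciences": (0.8856, 0.0056),
    "stem": (0.8081, 0.0068),
}

# Assumption: normal approximation, two-sided 95% interval.
Z_95 = 1.96


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) for value +/- z * stderr, clamped to [0, 1]."""
    half_width = z * stderr
    return max(0.0, value - half_width), min(1.0, value + half_width)


def format_report(results):
    """Build one text line per group: name, accuracy, and approximate interval."""
    lines = []
    for group, (value, stderr) in results.items():
        low, high = confidence_interval(value, stderr)
        lines.append(f"{group:<16} {value:.4f}  [{low:.4f}, {high:.4f}]")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(RESULTS))
```

Under these assumptions the humanities and social sciences intervals do not overlap, which suggests the gap between those two groups is larger than sampling noise alone would explain.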