Satwik11 committed on
Commit
6261cc1
1 Parent(s): 40c317f

Update README.md

  1. README.md +34 -0
README.md CHANGED
@@ -6,6 +6,40 @@ base_model:
  - Qwen/QwQ-32B-Preview
  new_version: Qwen/QwQ-32B-Preview
  ---
+ ## Evaluation Results
+
+ ### Evaluation Metrics
+
+ | **Groups** | **Version** | **Filter** | **n-shot** | **Metric** | **Direction** | **Value** | **Stderr** |
+ |----------------------|:-----------:|:----------:|:----------:|:----------:|:-------------:|----------:|-----------:|
+ | **mmlu** | 2 | none | - | acc | ↑ | 0.8034 | ±0.0032 |
+ |     **humanities** | 2 | none | - | acc | ↑ | 0.7275 | ±0.0062 |
+ |     **other** | 2 | none | - | acc | ↑ | 0.8323 | ±0.0064 |
+ |     **social sciences**| 2 | none | - | acc | ↑ | 0.8856 | ±0.0056 |
+ |     **stem** | 2 | none | - | acc | ↑ | 0.8081 | ±0.0068 |
+
+ ### Description
+
+ - **mmlu**: Overall accuracy across all MMLU subjects.
+ - **humanities**: Accuracy on humanities subjects.
+ - **other**: Accuracy on subjects outside the other three groups (in MMLU, business, health, and miscellaneous topics).
+ - **social sciences**: Accuracy on social science subjects.
+ - **stem**: Accuracy on STEM (science, technology, engineering, and mathematics) subjects.
+
+ ### Visualization
+
+ Where Mermaid rendering is supported, the following chart shows the accuracy for each group:
+
+ ```mermaid
+ xychart-beta
+     title "Accuracy Metrics by Group"
+     x-axis ["mmlu", "humanities", "other", "social sciences", "stem"]
+     y-axis "Accuracy" 0 --> 1
+     bar [0.8034, 0.7275, 0.8323, 0.8856, 0.8081]
+ ```
  # QwQ-32B-Preview-quantized-autoround-GPTQ-sym-4bit
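
The Stderr column in the evaluation table can be read as an approximate confidence interval around each accuracy value. The following is a minimal Python sketch of that arithmetic, assuming a normal approximation (value ± 1.96 × stderr for a 95% interval). The names `RESULTS`, `confidence_interval`, and `summarize` are illustrative only and are not part of the model card or of any evaluation tool.

```python
# Accuracy and standard error per group, copied from the Evaluation Metrics table.
RESULTS = {
    "mmlu": (0.8034, 0.0032),
    "humanities": (0.7275, 0.0062),
    "other": (0.8323, 0.0064),
    "social sciences": (0.8856, 0.0056),
    "stem": (0.8081, 0.0068),
}

# Assumption: normal approximation, two-sided 95% interval.
Z_95 = 1.96


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) for value +/- z * stderr, clipped to [0, 1]."""
    half_width = z * stderr
    return max(0.0, value - half_width), min(1.0, value + half_width)


def summarize(results):
    """Map each group name to its (low, high) interval."""
    return {
        group: confidence_interval(value, stderr)
        for group, (value, stderr) in results.items()
    }


if __name__ == "__main__":
    for group, (low, high) in summarize(RESULTS).items():
        print(f"{group:<16} {low:.4f} to {high:.4f}")
```

Under this approximation, the intervals for humanities and social sciences do not overlap, so the gap between those two groups is larger than the reported sampling error. It says nothing about how these scores compare with the unquantized base model, which this table does not report.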