---
datasets: wikitext
license: other
license_link: https://llama.meta.com/llama3/license/
---
This is a quantized version of [Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct), created with GPTQ as developed by [IST Austria](https://ist.ac.at/en/research/alistarh-group/), using the following configuration:
- Bits: 4
- Act order: True
- Group size: 128
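
For reference, a roughly equivalent setup can be expressed with Hugging Face Transformers' `GPTQConfig`. This is a minimal sketch of the configuration above, not the exact pipeline used to produce this checkpoint; in particular, `dataset="wikitext2"` is an assumption based on the `wikitext` tag:

```
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Mirrors the settings listed above: 4-bit weights, act order, group size 128.
# The calibration dataset is an assumption (the card only tags `wikitext`).
quant_config = GPTQConfig(
    bits=4,
    group_size=128,
    desc_act=True,       # act order
    dataset="wikitext2",
    tokenizer=tokenizer,
)

# Quantizes the weights while loading; a 70B model needs substantial GPU memory for this step.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```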

## Usage
Install **vLLM** and run the [server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#openai-compatible-server):
```
python -m vllm.entrypoints.openai.api_server --model cortecs/Meta-Llama-3-70B-Instruct-GPTQ
```
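The checkpoint can also be used for offline inference through vLLM's Python API instead of the server. A minimal sketch; `tensor_parallel_size=2` is an assumption matching the two-GPU setup in the performance section and should be adjusted to your hardware:

```
from vllm import LLM, SamplingParams

# GPTQ quantization is detected from the checkpoint's config.
llm = LLM(
    model="cortecs/Meta-Llama-3-70B-Instruct-GPTQ",
    tensor_parallel_size=2,  # assumes two GPUs
)

sampling_params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["San Francisco is a"], sampling_params)
print(outputs[0].outputs[0].text)
```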
Query the running server:
```
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "cortecs/Meta-Llama-3-70B-Instruct-GPTQ",
        "prompt": "San Francisco is a"
    }'
```
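Because the server exposes an OpenAI-compatible API, it can also be queried with the official `openai` Python client. A short sketch; vLLM ignores the API key, but the client requires a non-empty value:

```
from openai import OpenAI

# vLLM's OpenAI-compatible server listens on port 8000 by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="cortecs/Meta-Llama-3-70B-Instruct-GPTQ",
    prompt="San Francisco is a",
    max_tokens=64,
)
print(completion.choices[0].text)
```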

## Evaluations
| __English__   | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
|:--------------|:-----------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------|
| Avg.          | 76.19                                                                                          | 76.16                                                                                                       | 75.14                                                                                                 |
| ARC           | 71.6                                                                                           | 71.4                                                                                                        | 70.7                                                                                                  |
| Hellaswag     | 77.3                                                                                           | 77.1                                                                                                        | 76.4                                                                                                  |
| MMLU          | 79.66                                                                                          | 79.98                                                                                                       | 78.33                                                                                                 |
|               |                                                                                                |                                                                                                             |                                                                                                       |
| __French__   | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
| Avg.         | 70.97                                                                                          | 71.03                                                                                                       | 70.27                                                                                                 |
| ARC_fr       | 65.0                                                                                           | 65.3                                                                                                        | 64.7                                                                                                  |
| Hellaswag_fr | 72.4                                                                                           | 72.4                                                                                                        | 71.4                                                                                                  |
| MMLU_fr      | 75.5                                                                                           | 75.4                                                                                                        | 74.7                                                                                                  |
|              |                                                                                                |                                                                                                             |                                                                                                       |
| __German__   | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
| Avg.         | 68.43                                                                                          | 68.37                                                                                                       | 66.93                                                                                                 |
| ARC_de       | 64.2                                                                                           | 64.3                                                                                                        | 62.6                                                                                                  |
| Hellaswag_de | 67.8                                                                                           | 67.7                                                                                                        | 66.7                                                                                                  |
| MMLU_de      | 73.3                                                                                           | 73.1                                                                                                        | 71.5                                                                                                  |
|              |                                                                                                |                                                                                                             |                                                                                                       |
| __Italian__   | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
| Avg.          | 70.17                                                                                          | 70.43                                                                                                       | 68.63                                                                                                 |
| ARC_it        | 64.0                                                                                           | 64.3                                                                                                        | 62.1                                                                                                  |
| Hellaswag_it  | 72.6                                                                                           | 72.4                                                                                                        | 71.0                                                                                                  |
| MMLU_it       | 73.9                                                                                           | 74.6                                                                                                        | 72.8                                                                                                  |
|               |                                                                                                |                                                                                                             |                                                                                                       |
| __Safety__          | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
| Avg.                | 64.28                                                                                          | 64.17                                                                                                       | 63.64                                                                                                 |
| RealToxicityPrompts | 97.9                                                                                           | 97.8                                                                                                        | 98.1                                                                                                  |
| TruthfulQA          | 61.91                                                                                          | 61.67                                                                                                       | 59.91                                                                                                 |
| CrowS               | 33.04                                                                                          | 33.04                                                                                                       | 32.92                                                                                                 |
|                     |                                                                                                |                                                                                                             |                                                                                                       |
| __Spanish__   |   __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__ |   __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__ |   __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__ |
| Avg.          |                                                                                           72.5 |                                                                                                        72.7 |                                                                                                  71.3 |
| ARC_es        |                                                                                           66.7 |                                                                                                        66.9 |                                                                                                  65.7 |
| Hellaswag_es  |                                                                                           75.8 |                                                                                                        75.9 |                                                                                                  74   |
| MMLU_es       |                                                                                           75   |                                                                                                        75.3 |                                                                                                  74.2 |

We did not check for data contamination. Evaluations were performed using the [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) with `limit=1000`.
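
A comparable run can be reproduced with the harness's Python entry point. This is a sketch that assumes lm-evaluation-harness v0.4+ with its vLLM backend available; the task list below covers only the English benchmarks, and the exact task versions used for the table above are not specified:

```
import lm_eval

# limit=1000 caps each task at 1000 examples, matching the setting reported above.
results = lm_eval.simple_evaluate(
    model="vllm",
    model_args="pretrained=cortecs/Meta-Llama-3-70B-Instruct-GPTQ,tensor_parallel_size=2",
    tasks=["arc_challenge", "hellaswag", "mmlu"],
    limit=1000,
)
print(results["results"])
```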
    
## Performance
| Hardware       |   requests/s |   tokens/s |
|:---------------|-------------:|-----------:|
| 2x NVIDIA L40S |            2 |     951.28 |
Performance measured on [cortecs inference](https://cortecs.ai).