cortecs/Meta-Llama-3-70B-Instruct-GPTQ

@@ -66,15 +66,6 @@ Take with caution. We did not check for data contamination.
      Evaluation was done using [Eval. Harness](https://github.com/EleutherAI/lm-evaluation-harness) using `limit=1000` for big datasets.
 ## Performance
-| __Llama-3 70B Instruct__   | __requests/s__   | __tokens/s__   |
-|:---------------------------|:-----------------|:---------------|
-| NVIDIA L40Sx4              | 2.38             | 1135.41        |
-|                            |                  |                |
-| __Llama 3 70B GPTQ__   | __requests/s__   | __tokens/s__   |
-| NVIDIA L40Sx2          | 2.0              | 951.28         |
-|                        |                  |                |
-| __Llama-3 8B Instruct__   |   __requests/s__ |   __tokens/s__ |
-| NVIDIA L40Sx1             |            11.64 |        5548.63 |
-| NVIDIA L4x1               |             2.76 |        1315.25 |
-| NVIDIA L4x2               |             4.79 |        2283.53 |
-Performance was measured on [cortecs.ai](https://cortecs.ai).

+|               |   requests/s |   tokens/s |
+|:--------------|-------------:|-----------:|
+| NVIDIA L40Sx2 |            2 |     951.28 |