mlabonne committed
Commit 5c820d1
1 Parent(s): 8f2837c

Update README.md
Files changed (1): README.md (+6, −5)
README.md CHANGED
@@ -4,6 +4,7 @@ tags:
 - merge
 - mergekit
 - lazymergekit
+- llama
 base_model:
 - NousResearch/Meta-Llama-3-8B-Instruct
 - mlabonne/OrpoLlama-3-8B
@@ -11,11 +12,11 @@ base_model:
 - abacusai/Llama-3-Smaug-8B
 ---
 
-# Chimera-8B
+# ChimeraLlama-3-8B
 
-Chimera-8B outperforms Llama 3 8B Instruct on Nous' benchmark suite.
+ChimeraLlama-3-8B outperforms Llama 3 8B Instruct on Nous' benchmark suite.
 
-Chimera-8B is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
+ChimeraLlama-3-8B is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
 * [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
 * [mlabonne/OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B)
 * [Locutusque/Llama-3-Orca-1.0-8B](https://huggingface.co/Locutusque/Llama-3-Orca-1.0-8B)
@@ -29,7 +30,7 @@ Evaluation performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoev
 
 | Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
 | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------: | --------: | --------: | ---------: | --------: |
-| [**mlabonne/Chimera-8B**](https://huggingface.co/mlabonne/Chimera-8B) [📄](https://gist.github.com/mlabonne/28d31153628dccf781b74f8071c7c7e4) | **51.58** | **39.12** | **71.81** | **52.4** | **42.98** |
+| [**mlabonne/ChimeraLlama-3-8B**](https://huggingface.co/mlabonne/Chimera-8B) [📄](https://gist.github.com/mlabonne/28d31153628dccf781b74f8071c7c7e4) | **51.58** | **39.12** | **71.81** | **52.4** | **42.98** |
 | [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) [📄](https://gist.github.com/mlabonne/8329284d86035e6019edb11eb0933628) | 51.34 | 41.22 | 69.86 | 51.65 | 42.64 |
 | [mlabonne/OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B) [📄](https://gist.github.com/mlabonne/22896a1ae164859931cc8f4858c97f6f) | 48.63 | 34.17 | 70.59 | 52.39 | 37.36 |
 | [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) [📄](https://gist.github.com/mlabonne/616b6245137a9cfc4ea80e4c6e55d847) | 45.42 | 31.1 | 69.95 | 43.91 | 36.7 |
@@ -73,7 +74,7 @@ from transformers import AutoTokenizer
 import transformers
 import torch
 
-model = "mlabonne/Chimera-8B"
+model = "mlabonne/ChimeraLlama-3-8B"
 messages = [{"role": "user", "content": "What is a large language model?"}]
 
 tokenizer = AutoTokenizer.from_pretrained(model)
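The commit only renames the model card; the merge itself was produced with LazyMergekit from the checkpoints listed above. The merge configuration is not shown in this diff, but the core idea behind the simplest mergekit method (a linear weighted average of corresponding parameters) can be sketched in plain Python. The function name, coefficients, and toy parameter lists below are illustrative assumptions, not the actual merge recipe; real merges operate on full model state dicts.

```python
# Hypothetical sketch of a linear weight merge, the basic idea mergekit's
# "linear" method applies tensor-by-tensor. Flat float lists stand in for
# model parameters for illustration.

def linear_merge(weight_sets, coeffs):
    """Average corresponding parameters with the given coefficients."""
    assert abs(sum(coeffs) - 1.0) < 1e-9, "coefficients should sum to 1"
    merged = []
    for params in zip(*weight_sets):
        merged.append(sum(c * p for c, p in zip(coeffs, params)))
    return merged

# Three toy "models", each a flat list of parameters.
model_a = [1.0, 2.0, 3.0]
model_b = [3.0, 2.0, 1.0]
model_c = [2.0, 2.0, 2.0]

merged = linear_merge([model_a, model_b, model_c], [0.5, 0.25, 0.25])
print(merged)  # [1.75, 2.0, 2.25]
```

Methods such as DARE or TIES refine this by dropping or sign-resolving conflicting parameter deltas before averaging, but the weighted-combination step is the same in spirit.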