mbrack committed on
Commit 0f3e5ae
1 Parent(s): 8cceabf

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -41,15 +41,15 @@ In the below image and corresponding table, you can see the benchmark scores for
 | Model | truthful_qa_de | truthfulqa_mc | arc_challenge | arc_challenge_de | hellaswag | hellaswag_de | MMLU | MMLU-DE | mean |
 |----------------------------------------------------|----------------|---------------|---------------|------------------|-------------|--------------|-------------|-------------|-------------|
 | meta-llama/Meta-Llama-3-8B-Instruct | 0.47498 | 0.43923 | **0.59642** | 0.47952 | **0.82025** | 0.60008 | **0.66658** | 0.53541 | 0.57656 |
-| DiscoResearch/Llama3_German_8B | 0.49499 | 0.44838 | 0.55802 | 0.49829 | 0.79924 | 0.65395 | 0.62240 | 0.54413 | 0.57743 |
-| DiscoResearch/Llama3_German_8B_32k | 0.48920 | 0.45138 | 0.54437 | 0.49232 | 0.79078 | 0.64310 | 0.58774 | 0.47971 | 0.55982 |
-| DiscoResearch/Llama3_DiscoLeo_Instruct_8B_v0.1 | **0.53042** | 0.52867 | 0.59556 | **0.53839** | 0.80721 | 0.66440 | 0.61898 | 0.56053 | **0.60552** |
-| **DiscoResearch/Llama3_DiscoLeo_Instruct_8B_32k_v0.1** | 0.52749 | **0.53245** | 0.58788 | 0.53754 | 0.80770 | **0.66709** | 0.62123 | **0.56238** | 0.60547 |
+| DiscoResearch/Llama3-German-8B | 0.49499 | 0.44838 | 0.55802 | 0.49829 | 0.79924 | 0.65395 | 0.62240 | 0.54413 | 0.57743 |
+| DiscoResearch/Llama3-German-8B-32k | 0.48920 | 0.45138 | 0.54437 | 0.49232 | 0.79078 | 0.64310 | 0.58774 | 0.47971 | 0.55982 |
+| DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1 | **0.53042** | 0.52867 | 0.59556 | **0.53839** | 0.80721 | 0.66440 | 0.61898 | 0.56053 | **0.60552** |
+| **DiscoResearch/Llama3-DiscoLeo-Instruct-8B-32k-v0.1** | 0.52749 | **0.53245** | 0.58788 | 0.53754 | 0.80770 | **0.66709** | 0.62123 | **0.56238** | 0.60547 |
 
 ## Model Configurations
 
 We release DiscoLeo-8B in the following configurations:
-1. [Base model with continued pretraining](https://huggingface.co/DiscoResearch/Llama3_German_8B)
+1. [Base model with continued pretraining](https://huggingface.co/DiscoResearch/Llama3-German_8B)
 2. [Long-context version (32k context length)](https://huggingface.co/DiscoResearch/Llama3_German_8B_32k)
 3. [Instruction-tuned version of the base model](https://huggingface.co/DiscoResearch/Llama3_DiscoLeo_Instruct_8B_v0.1)
 4. [Instruction-tuned version of the long-context model](https://huggingface.co/DiscoResearch/Llama3_DiscoLeo_Instruct_8B_32k_v0.1) (This model)
@@ -62,11 +62,11 @@ Here's how to use the model with transformers:
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 model = AutoModelForCausalLM.from_pretrained(
-    "DiscoResearch/Llama3_DiscoLeo_Instruct_8B_v0.1",
+    "DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1",
     torch_dtype="auto",
     device_map="auto"
 )
-tokenizer = AutoTokenizer.from_pretrained("DiscoResearch/Llama3_DiscoLeo_Instruct_8B_32k_v0.1")
+tokenizer = AutoTokenizer.from_pretrained("DiscoResearch/Llama3-DiscoLeo-Instruct-8B-32k-v0.1")
 
 prompt = "Schreibe ein Essay über die Bedeutung der Energiewende für Deutschlands Wirtschaft"
 messages = [
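The net effect of this commit is a rename of the DiscoResearch repo IDs from underscore-separated to hyphenated form. A minimal sketch of that mapping, useful for updating scripts that still pin the old IDs (the `renamed` helper is hypothetical, not part of any library; the mapping is read off the diff above):

```python
# Old -> new Hugging Face repo IDs, taken from the -/+ lines of the diff above.
RENAMES = {
    "DiscoResearch/Llama3_German_8B": "DiscoResearch/Llama3-German-8B",
    "DiscoResearch/Llama3_German_8B_32k": "DiscoResearch/Llama3-German-8B-32k",
    "DiscoResearch/Llama3_DiscoLeo_Instruct_8B_v0.1": "DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1",
    "DiscoResearch/Llama3_DiscoLeo_Instruct_8B_32k_v0.1": "DiscoResearch/Llama3-DiscoLeo-Instruct-8B-32k-v0.1",
}

def renamed(repo_id: str) -> str:
    """Return the hyphenated ID for a known old-style ID, else the input unchanged."""
    return RENAMES.get(repo_id, repo_id)

print(renamed("DiscoResearch/Llama3_German_8B"))  # DiscoResearch/Llama3-German-8B
```

Note that the usage snippet in the diff loads the model and tokenizer from two different repos (`...Instruct-8B-v0.1` vs. `...Instruct-8B-32k-v0.1`); since this README belongs to the 32k model, loading both from a single repo ID would avoid a model/tokenizer mismatch.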