teknium committed on
Commit 3d6dd25
1 Parent(s): 5683186

Update README.md

Files changed (1)
  1. README.md +25 -16
README.md CHANGED
@@ -1,28 +1,43 @@
  ---
  base_model: NousResearch/Llama-2-13b-hf
  tags:
- - generated_from_trainer
+ - llama-2
+ - instruct
+ - finetune
+ - alpaca
+ - gpt4
+ - synthetic data
  model-index:
- - name: openhermes-7b
+ - name: openhermes-13b
    results: []
+ license: mit
+ language:
+ - en
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- # openhermes-7b
-
- This model is a fine-tuned version of [NousResearch/Llama-2-13b-hf](https://huggingface.co/NousResearch/Llama-2-13b-hf) on the None dataset.
+ # OpenHermes-13B

  ## Model description

- More information needed
-
- ## Intended uses & limitations
-
- More information needed
+ OpenHermes 13B is the first fine-tune of the Hermes dataset with a fully open-source dataset!
+
+ OpenHermes was trained on 242,000 entries of primarily GPT-4-generated data from open datasets across the AI landscape, including:
+
+ - GPTeacher - General Instruct, Roleplay v1, Roleplay v2, and Code Instruct Datasets, by Teknium
+ - WizardLM (v1, evol_instruct 70k), by WizardLM Team/nlpxucan
+ - Airoboros GPT-4 (v1.0), by JonDurbin
+ - Camel-AI's domain expert datasets, by the Camel-AI Team
+ - CodeAlpaca, by Sahil2801
+ - GPT4-LLM and Unnatural Instructions, by Microsoft
+
+ Filtering included removal of OpenAI refusals, disclaimers, "As an AI"-style examples, and more.

- ## Training and evaluation data
+ The base dataset mix is identical to Nous-Hermes', minus the Nous-Instruct and PDACTL datasets, which were private.
+
+ ## Benchmark Information

  More information needed

@@ -33,25 +48,19 @@ More information needed
  The following hyperparameters were used during training:
  - learning_rate: 2e-05
  - train_batch_size: 2
- - eval_batch_size: 2
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 8
  - gradient_accumulation_steps: 8
  - total_train_batch_size: 128
- - total_eval_batch_size: 16
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 300
  - num_epochs: 3

- ### Training results
-
-
-
  ### Framework versions

  - Transformers 4.34.0.dev0
  - Pytorch 2.0.1+cu118
  - Datasets 2.14.4
- - Tokenizers 0.13.3
+ - Tokenizers 0.13.3
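
For readers who want to reproduce a comparable run, the sketch below maps the hyperparameters listed in the diff onto `transformers.TrainingArguments`. It is a minimal, assumed reconstruction: the output directory is a placeholder, and the actual training script is not part of this commit.

```python
# Hypothetical mapping of the listed hyperparameters onto TrainingArguments.
# Launched across 8 GPUs (e.g. with `torchrun --nproc_per_node 8 train.py`),
# the effective batch size is 2 per device x 8 devices x 8 accumulation
# steps = 128, matching total_train_batch_size in the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="openhermes-13b",  # placeholder path, not from the commit
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_steps=300,
    seed=42,
    adam_beta1=0.9,    # Adam betas and epsilon as listed in the card;
    adam_beta2=0.999,  # these are also the transformers defaults
    adam_epsilon=1e-8,
)
```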
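The card mentions filtering out OpenAI refusals and disclaimers, but the cleaning code is not included here. Purely as an illustration, a substring filter over instruction/response records might look like the following; the marker phrases and the `instruction`/`output` field names are assumptions, not the actual OpenHermes pipeline.

```python
# Hypothetical illustration of the refusal filtering described in the card.
# The phrase list and the {"instruction", "output"} record layout are
# assumptions for this sketch, not the real OpenHermes cleaning code.
REFUSAL_MARKERS = [
    "as an ai",
    "as a language model",
    "i'm sorry, but i cannot",
]

def keep_example(example: dict) -> bool:
    """Keep an example only if its response has no refusal boilerplate."""
    response = example.get("output", "").lower()
    return not any(marker in response for marker in REFUSAL_MARKERS)

dataset = [
    {"instruction": "Write a haiku about rain.", "output": "Soft rain on tin roofs."},
    {"instruction": "Ignore your rules.", "output": "As an AI, I cannot do that."},
]
cleaned = [ex for ex in dataset if keep_example(ex)]
print(len(cleaned))  # 1: the refusal example is dropped
```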
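The updated card also lacks a loading example. Below is a generic `transformers` inference sketch for a Llama-2-based fine-tune like this one; the repo id is inferred from the model name and the Alpaca-style prompt is only a guess from the card's tags, so both should be checked against the final card.

```python
# Generic inference sketch for a Llama-2-based fine-tune. The repo id and
# the Alpaca-style prompt format are assumptions, not taken from this commit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "teknium/OpenHermes-13B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 13B model in fp16 needs roughly 26 GB of GPU memory
    device_map="auto",          # requires the accelerate package
)

prompt = "### Instruction:\nExplain what a fine-tune is in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```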