teknium committed on
Commit 3d6dd25
1 Parent(s): 5683186

Update README.md

Files changed (1)
  1. README.md +25 -16
README.md CHANGED
@@ -1,28 +1,43 @@
  ---
  base_model: NousResearch/Llama-2-13b-hf
  tags:
- - generated_from_trainer
+ - llama-2
+ - instruct
+ - finetune
+ - alpaca
+ - gpt4
+ - synthetic data
  model-index:
- - name: openhermes-7b
+ - name: openhermes-13b
    results: []
+ license: mit
+ language:
+ - en
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- # openhermes-7b
-
- This model is a fine-tuned version of [NousResearch/Llama-2-13b-hf](https://huggingface.co/NousResearch/Llama-2-13b-hf) on the None dataset.
+ # OpenHermes-13B

  ## Model description

- More information needed
-
- ## Intended uses & limitations
-
- More information needed
+ OpenHermes 13B is the first fine-tune of the Hermes dataset with a fully open-source dataset!
+
+ OpenHermes was trained on 242,000 entries of primarily GPT-4-generated data from open datasets across the AI landscape, including:
+
+ - GPTeacher - General Instruct, Roleplay v1, Roleplay v2, and Code Instruct Datasets, by Teknium
+ - WizardLM (v1, evol_instruct 70k), by WizardLM Team/nlpxucan
+ - Airoboros GPT-4 (v1.0), by JonDurbin
+ - Camel-AI's domain expert datasets, by the Camel-AI Team
+ - CodeAlpaca, by Sahil2801
+ - GPT4-LLM and Unnatural Instructions, by Microsoft
+
+ Filtering included removal of OpenAI refusals, disclaimers, "As an AI"-style examples, and more.

- ## Training and evaluation data
+ The base dataset mix is identical to Nous-Hermes', minus the Nous-Instruct and PDACTL datasets, which were private.
+
+ ## Benchmark Information

  More information needed

@@ -33,25 +48,19 @@ More information needed
  The following hyperparameters were used during training:
  - learning_rate: 2e-05
  - train_batch_size: 2
- - eval_batch_size: 2
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 8
  - gradient_accumulation_steps: 8
  - total_train_batch_size: 128
- - total_eval_batch_size: 16
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 300
  - num_epochs: 3

- ### Training results
-
-
-
  ### Framework versions

  - Transformers 4.34.0.dev0
  - Pytorch 2.0.1+cu118
  - Datasets 2.14.4
- - Tokenizers 0.13.3
+ - Tokenizers 0.13.3
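
For readers who want to reproduce a comparable run, the sketch below maps the hyperparameters listed in the diff onto `transformers.TrainingArguments`. It is a minimal, assumed reconstruction: the output directory is a placeholder, and the actual training script is not part of this commit.

```python
# Hypothetical mapping of the listed hyperparameters onto TrainingArguments.
# Launched across 8 GPUs (e.g. with `torchrun --nproc_per_node 8 train.py`),
# the effective batch size is 2 per device x 8 devices x 8 accumulation
# steps = 128, matching total_train_batch_size in the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="openhermes-13b",  # placeholder path, not from the commit
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_steps=300,
    seed=42,
    adam_beta1=0.9,    # Adam betas and epsilon as listed in the card;
    adam_beta2=0.999,  # these are also the transformers defaults
    adam_epsilon=1e-8,
)
```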
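The card mentions filtering out OpenAI refusals and disclaimers, but the cleaning code is not included here. Purely as an illustration, a substring filter over instruction/response records might look like the following; the marker phrases and the `instruction`/`output` field names are assumptions, not the actual OpenHermes pipeline.

```python
# Hypothetical illustration of the refusal filtering described in the card.
# The phrase list and the {"instruction", "output"} record layout are
# assumptions for this sketch, not the real OpenHermes cleaning code.
REFUSAL_MARKERS = [
    "as an ai",
    "as a language model",
    "i'm sorry, but i cannot",
]

def keep_example(example: dict) -> bool:
    """Keep an example only if its response has no refusal boilerplate."""
    response = example.get("output", "").lower()
    return not any(marker in response for marker in REFUSAL_MARKERS)

dataset = [
    {"instruction": "Write a haiku about rain.", "output": "Soft rain on tin roofs."},
    {"instruction": "Ignore your rules.", "output": "As an AI, I cannot do that."},
]
cleaned = [ex for ex in dataset if keep_example(ex)]
print(len(cleaned))  # 1: the refusal example is dropped
```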
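The updated card also lacks a loading example. Below is a generic `transformers` inference sketch for a Llama-2-based fine-tune like this one; the repo id is inferred from the model name and the Alpaca-style prompt is only a guess from the card's tags, so both should be checked against the final card.

```python
# Generic inference sketch for a Llama-2-based fine-tune. The repo id and
# the Alpaca-style prompt format are assumptions, not taken from this commit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "teknium/OpenHermes-13B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 13B model in fp16 needs roughly 26 GB of GPU memory
    device_map="auto",          # requires the accelerate package
)

prompt = "### Instruction:\nExplain what a fine-tune is in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```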