Text Generation · Transformers · PyTorch · English · llama · text-generation-inference · Inference Endpoints
hamishivi committed · Commit 00ae9d3 · 1 Parent(s): 071a3f2

Update README.md

Files changed (1): README.md (+3 −1)
README.md CHANGED
@@ -4,6 +4,7 @@ model-index:
   results: []
 datasets:
 - anon8231489123/ShareGPT_Vicuna_unfiltered
+- HuggingFaceH4/ultrafeedback_binarized
 language:
 - en
 base_model: meta-llama/Llama-2-7b-hf
@@ -15,7 +16,8 @@ base_model: meta-llama/Llama-2-7b-hf
 # Model Card for Open Instruct ShareGPT DPO Llama2 7B
 
 This model belongs to the Tulu series of models, which is a series of language models that are trained to act as helpful assistants.
-Open Instruct ShareGPT Llama2 7B is a fine-tuned version of Llama 2 that was trained on the [ShareGPT dataset](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered).
+Open Instruct ShareGPT Llama2 7B was initially fine-tuned from Llama 2 on the [ShareGPT dataset](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered).
+The model was then further trained on the UltraFeedback dataset using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
 Please check out our paper [TODO] for more!
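The added README lines reference Direct Preference Optimization. As a rough illustration only (not the actual training code used for this model), the per-pair DPO objective from the cited paper can be sketched in plain Python; the function name and scalar log-probability inputs are illustrative assumptions:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair (illustrative sketch).

    Each argument is the summed log-probability of the chosen or
    rejected response under the trainable policy or the frozen
    reference model; beta scales the implicit reward.
    """
    # Implicit reward margin: how much more strongly the policy
    # prefers the chosen response, relative to the reference model.
    logits = beta * ((policy_chosen_logp - policy_rejected_logp)
                     - (ref_chosen_logp - ref_rejected_logp))
    # Negative log-sigmoid of the margin (binary cross-entropy form).
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference model the margin is zero and the loss equals ln 2; widening the policy's preference for the chosen response drives the loss below that baseline.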