xww033 committed
Commit a9e03c5 (1 parent: 3de4cd6)

Update README.md

Files changed (1): README.md (+3, -3)
README.md CHANGED
````diff
@@ -22,7 +22,7 @@ The model is tuned after 4 iterations of online alignment. In each iteration, we
 
 - Step 3: Apply CUT to fine-tune the target model with the above instruction-response-judgment triplets.
 
-We use [LLaMA2-chat-13b](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf) as the base LLM. In each iteration, we sample 1000 instructions from [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca).
+Specifically, we use [LLaMA2-chat-13b](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf) as the base LLM. In each iteration, we sample 1000 instructions from [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca).
 To avoid over-fitting, we ensure that the sampled data are different in each iteration.
 We then ask GPT4 for the judgment annotation.
 
@@ -40,7 +40,7 @@ Below is an instruction that describes a task. Write a response that appropriate
 
 ### 3. How to use
 
-#### 1. Huggingface
+#### 3.1. Huggingface
 
 ```python
 import torch
@@ -63,7 +63,7 @@ text = tokenizer.batch_decode(outputs)[0]
 print(text)
 ```
 
-#### 2. FastChat
+#### 3.2. FastChat
 
 [Fastchat](https://github.com/lm-sys/FastChat) provides a simple setup for those interested in trying our aligned model. After downloading the [CUT model](https://huggingface.co/xww033/cut-13b) through HuggingFace, clone the Fastchat repository:
 
````
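For context, the procedure edited in the first hunk (sample Alpaca instructions, collect the model's responses, obtain GPT4 judgments, then fine-tune with CUT) can be read as a simple loop. The sketch below is illustrative only; `sample_instructions`, `generate_responses`, `annotate_with_gpt4`, and `cut_finetune` are hypothetical placeholders, not functions from this repository.

```python
# Illustrative sketch of the 4-iteration online-alignment loop described in the
# first hunk. Every helper called here is a hypothetical placeholder.

def online_alignment(base_model, alpaca_pool, num_iterations=4, samples_per_iter=1000):
    model = base_model  # LLaMA2-chat-13b in the README's setup
    used = set()        # keep iterations disjoint to avoid over-fitting
    for _ in range(num_iterations):
        # Sample 1000 instructions not seen in earlier iterations.
        batch = sample_instructions(alpaca_pool, samples_per_iter, exclude=used)
        used.update(batch)
        # Collect the current model's responses and GPT4 judgments on them.
        responses = generate_responses(model, batch)
        judgments = annotate_with_gpt4(batch, responses)
        # Step 3 of the README: fine-tune with CUT on the instruction-response-judgment triplets.
        model = cut_finetune(model, list(zip(batch, responses, judgments)))
    return model
```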
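The diff only shows the first and last lines of the README's Python example in section 3.1. For orientation, here is a minimal sketch of loading and querying the [CUT model](https://huggingface.co/xww033/cut-13b) with the standard `transformers` API; the exact prompt template, generation arguments, and dtype settings are assumptions and may differ from the README's actual snippet.

```python
# Minimal sketch, not the README's original snippet: assumes the standard
# transformers AutoTokenizer/AutoModelForCausalLM API and an Alpaca-style prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "xww033/cut-13b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # assumption: fp16 so the 13B model fits on one GPU
    device_map="auto",
)

# Assumption: Alpaca-style prompt, matching the template quoted in the second hunk header.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain instruction tuning in one paragraph.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
text = tokenizer.batch_decode(outputs)[0]
print(text)
```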