imone committed on
Commit
cc70818
1 Parent(s): ee7406d

[doc] update README

Files changed (1)
  1. README.md +27 -27
README.md CHANGED
@@ -33,35 +33,23 @@ To deploy the server as an online service, use `--api-keys sk-KEY1 sk-KEY2 ...`
33
  curl http://localhost:18888/v1/chat/completions \
34
  -H "Content-Type: application/json" \
35
  -d '{
36
- "model": "openchat_v3.1_llama2",
37
  "messages": [{"role": "user", "content": "You are a large language model named OpenChat. Write a poem to describe yourself"}]
38
  }'
39
  ```
40
 
41
  </details>
42
 
43
- | Model | Size | Context | Weights | Serving |
44
- |---------------|------|---------|-------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|
45
- | OpenChat 3.1 | 13B | 4096 | [Huggingface](https://huggingface.co/openchat/openchat_v3.1) | `python -m ochat.serving.openai_api_server --model-type openchat_v3.1_llama2 --model openchat/openchat_v3.1 --engine-use-ray --worker-use-ray --max-num-batched-tokens 5120` |
46
- | OpenChat 3.2 | 13B | 4096 | [Huggingface](https://huggingface.co/openchat/openchat_v3.2) | `python -m ochat.serving.openai_api_server --model-type openchat_v3.2 --model openchat/openchat_v3.2 --engine-use-ray --worker-use-ray --max-num-batched-tokens 5120` |
47
 
48
  For inference with Huggingface Transformers (slow and not recommended), follow the conversation template provided below:
49
 
50
  <details>
51
  <summary>Conversation templates (click to expand)</summary>
52
 
53
- V3.1
54
-
55
- ```python
56
- # Single-turn V3.1
57
- tokenize("Assistant is GPT4<|end_of_turn|>User: Hello<|end_of_turn|>Assistant:")
58
- # Result: [1, 4007, 22137, 338, 402, 7982, 29946, 32000, 4911, 29901, 15043, 32000, 4007, 22137, 29901]
59
-
60
- # Multi-turn V3.1
61
- tokenize("Assistant is GPT4<|end_of_turn|>User: Hello<|end_of_turn|>Assistant: Hi<|end_of_turn|>User: How are you today?<|end_of_turn|>Assistant:")
62
- # Result: [1, 4007, 22137, 338, 402, 7982, 29946, 32000, 4911, 29901, 15043, 32000, 4007, 22137, 29901, 6324, 32000, 4911, 29901, 1128, 526, 366, 9826, 29973, 32000, 4007, 22137, 29901]
63
- ```
64
-
65
  V3.2
66
 
67
  ```python
@@ -74,6 +62,18 @@ tokenize("GPT4 User: Hello<|end_of_turn|>GPT4 Assistant: Hi<|end_of_turn|>GPT4 U
74
  # Result: [1, 402, 7982, 29946, 4911, 29901, 15043, 32000, 402, 7982, 29946, 4007, 22137, 29901, 6324, 32000, 402, 7982, 29946, 4911, 29901, 1128, 526, 366, 9826, 29973, 32000, 402, 7982, 29946, 4007, 22137, 29901]
75
  ```
76
 
77
  </details>
78
 
79
  ## <a id="benchmarks"></a> Benchmarks
@@ -82,16 +82,16 @@ We have evaluated our models using the two most popular evaluation benchmarks **
82
 
83
  To ensure consistency, we used the same routine as ChatGPT / GPT-4 to run these benchmarks. We started the OpenAI API-compatible server and set the `openai.api_base` to `http://localhost:18888/v1` in the benchmark program.
84
 
85
- | **Model** | **Size** | **Context** | **💲Free** | **AlpacaEval (win rate %)** | **MT-bench (score)** | **MT-bench (win rate adjusted %)** |
86
- |------------------|----------|-------------|-----------|-----------------------------|----------------------|------------------------------------|
87
- | | | | | **v.s. text-davinci-003** | | **v.s. ChatGPT** |
88
- | GPT-4 | 1.8T* | 8K | ❌ | 95.3 | 8.99 | 82.5 |
89
- | ChatGPT | 175B* | 4K | ❌ | 89.4 | 7.94 | 50.0 |
90
- | Llama-2-70B-Chat | 70B | 4K | ✅ | 92.7 | 6.86 | |
91
- | **OpenChat 3.1** | 13B | 4K | ✅ | **89.5** | **6.65** | **50.0** |
92
- | **OpenChat 3.2** | 13B | 4K | ✅ | **89.1** | **7.01** | **51.6** |
93
- | Llama-2-13B-Chat | 13B | 4K | ✅ | 81.0 | 6.65 | |
94
- | Vicuna 1.3 | 13B | 2K | ❌ | 82.1 | 6.00 | 37.5 |
95
 
96
  *: Estimated model size
97
 
 
33
  curl http://localhost:18888/v1/chat/completions \
34
  -H "Content-Type: application/json" \
35
  -d '{
36
+ "model": "openchat_v3.2",
37
  "messages": [{"role": "user", "content": "You are a large language model named OpenChat. Write a poem to describe yourself"}]
38
  }'
39
  ```
40
 
41
  </details>
42
 
43
+ | Model | Size | Context | Weights | Serving |
44
+ |--------------|------|---------|--------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
45
+ | OpenChat 3.2 | 13B | 4096 | [Huggingface](https://huggingface.co/openchat/openchat_v3.2) | `python -m ochat.serving.openai_api_server --model-type openchat_v3.2 --model openchat/openchat_v3.2 --engine-use-ray --worker-use-ray --max-num-batched-tokens 5120` |
46
+ | OpenChat 3.1 | 13B | 4096 | [Huggingface](https://huggingface.co/openchat/openchat_v3.1) | `python -m ochat.serving.openai_api_server --model-type openchat_v3.1_llama2 --model openchat/openchat_v3.1 --engine-use-ray --worker-use-ray --max-num-batched-tokens 5120` |
47
 
48
  For inference with Huggingface Transformers (slow and not recommended), follow the conversation template provided below:
49
 
50
  <details>
51
  <summary>Conversation templates (click to expand)</summary>
52
 
53
  V3.2
54
 
55
  ```python
 
62
  # Result: [1, 402, 7982, 29946, 4911, 29901, 15043, 32000, 402, 7982, 29946, 4007, 22137, 29901, 6324, 32000, 402, 7982, 29946, 4911, 29901, 1128, 526, 366, 9826, 29973, 32000, 402, 7982, 29946, 4007, 22137, 29901]
63
  ```
64
 
65
+ V3.1
66
+
67
+ ```python
68
+ # Single-turn V3.1
69
+ tokenize("Assistant is GPT4<|end_of_turn|>User: Hello<|end_of_turn|>Assistant:")
70
+ # Result: [1, 4007, 22137, 338, 402, 7982, 29946, 32000, 4911, 29901, 15043, 32000, 4007, 22137, 29901]
71
+
72
+ # Multi-turn V3.1
73
+ tokenize("Assistant is GPT4<|end_of_turn|>User: Hello<|end_of_turn|>Assistant: Hi<|end_of_turn|>User: How are you today?<|end_of_turn|>Assistant:")
74
+ # Result: [1, 4007, 22137, 338, 402, 7982, 29946, 32000, 4911, 29901, 15043, 32000, 4007, 22137, 29901, 6324, 32000, 4911, 29901, 1128, 526, 366, 9826, 29973, 32000, 4007, 22137, 29901]
75
+ ```
76
+
77
  </details>
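
The two conversation templates above differ only in a per-turn prefix and an optional system line, so they can be assembled by one small helper. The following is a minimal sketch, not part of the OpenChat codebase: `build_prompt` and its `version` argument are hypothetical names, and the V3.2 single-turn layout is an assumption inferred from the multi-turn example, since the diff does not show it.

```python
END_OF_TURN = "<|end_of_turn|>"

# Display names used in the templates shown in the README diff.
ROLE_NAMES = {"user": "User", "assistant": "Assistant"}


def build_prompt(messages, version="v3.2"):
    """Assemble a prompt string from (role, text) pairs.

    Hypothetical helper for illustration only; roles are "user" or "assistant".
    """
    if version == "v3.2":
        # V3.2 prefixes every turn with "GPT4 " and has no system line.
        prefix, system = "GPT4 ", ""
    elif version == "v3.1":
        # V3.1 opens with a system line and uses bare role names.
        prefix, system = "", "Assistant is GPT4" + END_OF_TURN
    else:
        raise ValueError(f"unknown template version: {version}")

    turns = "".join(
        f"{prefix}{ROLE_NAMES[role]}: {text}{END_OF_TURN}" for role, text in messages
    )
    # The prompt ends with an open assistant turn for the model to complete.
    return f"{system}{turns}{prefix}Assistant:"


conversation = [
    ("user", "Hello"),
    ("assistant", "Hi"),
    ("user", "How are you today?"),
]
v31_prompt = build_prompt(conversation, version="v3.1")
v32_prompt = build_prompt(conversation, version="v3.2")
```

The resulting string would still need to be tokenized with the model's own tokenizer, so that `<|end_of_turn|>` maps to the single special token (ID 32000 in the results above).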
78
 
79
  ## <a id="benchmarks"></a> Benchmarks
 
82
 
83
  To ensure consistency, we used the same routine as ChatGPT / GPT-4 to run these benchmarks. We started the OpenAI API-compatible server and set the `openai.api_base` to `http://localhost:18888/v1` in the benchmark program.
84
 
85
+ | **Model** | **Size** | **Context** | **💲Free** | **AlpacaEval (win rate %)** | **MT-bench (win rate adjusted %)** | **MT-bench (score)** |
86
+ |------------------|----------|-------------|------------|-----------------------------|------------------------------------|----------------------|
87
+ | | | | | **v.s. text-davinci-003** | **v.s. ChatGPT** | |
88
+ | GPT-4 | 1.8T* | 8K | ❌ | 95.3 | 82.5 | 8.99 |
89
+ | ChatGPT | 175B* | 4K | ❌ | 89.4 | 50.0 | 7.94 |
90
+ | Llama-2-70B-Chat | 70B | 4K | ✅ | 92.7 | | 6.86 |
91
+ | **OpenChat 3.2** | **13B** | **4K** | ✅ | **89.1** | **51.6** | **7.01** |
92
+ | **OpenChat 3.1** | **13B** | **4K** | ✅ | **89.5** | **50.0** | **6.65** |
93
+ | Llama-2-13B-Chat | 13B | 4K | ✅ | 81.0 | | 6.65 |
94
+ | Vicuna 1.3 | 13B | 2K | ❌ | 82.1 | 37.5 | 6.00 |
95
 
96
  *: Estimated model size
97
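
For readers who prefer Python to curl, the chat completions request from the first hunk can be reproduced with the standard library alone. This is a rough sketch under stated assumptions: `build_chat_request` and `send` are hypothetical helper names, and passing an API key as an `Authorization: Bearer` header is assumed from the OpenAI convention rather than confirmed by this diff.

```python
import json
import urllib.request

# Base URL quoted in the README for the local OpenAI-compatible server.
API_BASE = "http://localhost:18888/v1"


def build_chat_request(prompt, model="openchat_v3.2", api_key=None):
    """Build (but do not send) a request equivalent to the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: keys given to --api-keys are sent as a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request):
    """Send the request and return the parsed JSON body (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_chat_request(
    "You are a large language model named OpenChat. Write a poem to describe yourself"
)
# With the server started as in the Serving column: reply = send(request)
```

Building the request separately from sending it keeps the payload inspectable without a running server. For OpenChat 3.1, the curl example's earlier model name `openchat_v3.1_llama2` would be passed as `model` instead.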