Update README.md
README.md
CHANGED
@@ -1,3 +1,208 @@
---
base_model: openlm-research/open_llama_3b_v2
tags:
- llama-2
- instruct
- finetune
- alpaca
- gpt4
- synthetic data
- distillation
datasets:
- teknium/openhermes
model-index:
- name: openhermes-7b
  results: []
license: apache-2.0
language:
- en
---

# Now quantized as Q5_0!

A huge thank you to the contributors of this beautiful model!
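
For a rough sense of what Q5_0 means for file size, the sketch below works through the storage arithmetic. It assumes the Q5_0 block layout used by llama.cpp (32 weights per block, 5 bits per weight, plus one 16-bit scale per block) and a hypothetical round figure of 3 billion quantized weights. The real file also holds metadata, and some tensors may be stored at other precisions, so treat the result as an estimate only.

```python
# Back-of-the-envelope size estimate for Q5_0 quantization.
# Assumed layout: blocks of 32 weights, 5 bits each, plus one fp16 scale per block.
BLOCK_WEIGHTS = 32
BITS_PER_QUANT = 5
SCALE_BITS = 16


def q5_0_bits_per_weight() -> float:
    """Average storage cost per weight, including the per-block scale."""
    block_bits = BLOCK_WEIGHTS * BITS_PER_QUANT + SCALE_BITS  # 176 bits = 22 bytes
    return block_bits / BLOCK_WEIGHTS


def estimated_size_gb(n_weights: float) -> float:
    """Approximate tensor data size in GB (10^9 bytes) if every weight were Q5_0."""
    return n_weights * q5_0_bits_per_weight() / 8 / 1e9


# Hypothetical round number, not the model's exact parameter count.
N_WEIGHTS = 3.0e9

if __name__ == "__main__":
    print(f"{q5_0_bits_per_weight():.1f} bits per weight")
    print(f"about {estimated_size_gb(N_WEIGHTS):.2f} GB at Q5_0")
    print(f"about {N_WEIGHTS * 16 / 8 / 1e9:.2f} GB at fp16")
```

Under these assumptions the quantized weights take roughly a third of the space of an fp16 copy.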

### About GGUF

GGUF is a new format introduced by the llama.cpp team on August 21st, 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
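
As a quick sanity check after downloading, a GGUF file can be recognized from its header. The sketch below is a minimal illustration, assuming the usual little-endian layout in which the file starts with the 4-byte magic `GGUF` followed by a 32-bit version number. The function name and the example file name are hypothetical.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_version(path: str) -> int:
    """Return the version number from a GGUF header, or raise ValueError.

    Assumes a little-endian file: 4 magic bytes, then an unsigned 32-bit version.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    (version,) = struct.unpack("<I", header[4:8])
    return version


# Example call (hypothetical file name):
# read_gguf_version("openhermes-llama-3b.Q5_0.gguf")
```

An older GGML file fails the magic check, which is one simple way to tell the two formats apart before loading a model.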

Here is an incomplete list of clients and libraries that are known to support GGUF:

* [llama.cpp](https://github.com/ggerganov/llama.cpp). The source project for GGUF. Offers a CLI and a server option.
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui), the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
* [KoboldCpp](https://github.com/LostRuins/koboldcpp), a fully featured web UI, with GPU acceleration across all platforms and GPU architectures. Especially good for storytelling.
* [GPT4All](https://gpt4all.io/index.html), a free and open-source locally running GUI, supporting Windows, Linux and macOS with full GPU acceleration.
* [LM Studio](https://lmstudio.ai/), an easy-to-use and powerful local GUI for Windows and macOS (Apple Silicon), with GPU acceleration. A Linux version is available, in beta as of 27/11/2023.
* [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui), a great web UI with many interesting and unique features, including a full model library for easy model selection.
* [Faraday.dev](https://faraday.dev/), an attractive and easy-to-use character-based chat GUI for Windows and macOS (both Apple Silicon and Intel), with GPU acceleration.
* [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python library with GPU acceleration, LangChain support, and an OpenAI-compatible API server.
* [candle](https://github.com/huggingface/candle), a Rust ML framework with a focus on performance, including GPU support, and ease of use.
* [ctransformers](https://github.com/marella/ctransformers), a Python library with GPU acceleration, LangChain support, and an OpenAI-compatible API server. Note: as of the time of writing (November 27th, 2023), ctransformers has not been updated in a long time and does not support many recent models.

# OpenHermes-Llama-3B
*An OpenLlama-3B finetune for role-play, instruction following, and code generation*

<img src="https://cdn-uploads.huggingface.co/production/uploads/648a374f00f7a3374ee64b99/bDQ8ybHVS--CibUf4rvFB.png" width="400" height="400">

*Logo generated by DALL-E*

For a guide on how this model was fine-tuned, see this [blog post](https://calebfahlgren.com/post/how-to-finetune-a-large-language-model).

# Model description
**OpenHermes-Llama-3B** is a finetune of OpenLlama-3B on the Hermes dataset, using a fully open-source version of that dataset (`teknium/openhermes`).
It was trained on 15,000 entries of primarily GPT-4 generated data, from open datasets across the AI landscape, including:

- GPTeacher - General Instruct, Roleplay v1, Roleplay v2, and Code Instruct Datasets, by Teknium
- WizardLM (v1, evol_instruct 70k), by WizardLM Team/nlpxucan
- Airoboros GPT-4 (v1.0), by JonDurbin
- Camel-AI's domain expert datasets, by the Camel-AI Team
- CodeAlpaca, by Sahil2801
- GPT4-LLM and Unnatural Instructions, by Microsoft

**OpenHermes-Llama-3B** has improved coding abilities and general instruction following compared with the OpenLlama-3B base model. The example outputs below compare the two models on the same prompts.

# Example Outputs

### Writing Code

**Q:** Write a Fast API endpoint with ratelimiting

**OpenHermes-Llama-3B Output**

<img src="https://cdn-uploads.huggingface.co/production/uploads/648a374f00f7a3374ee64b99/ZZxphP0JN1Y6HfcGDRwJ1.png" width="600" height="600">

**Base Model Output**

*Create a new FastAPI app

*Create a new endpoint

*Create a new ratelimit middleware

*Create a new ratelimit rule

*Create a new ratelimit backend

*Create a new ratelimit backend

### Planning an Itinerary

**Q:** Create a travel itinery for italy for 3 days for me.

**OpenHermes-Llama-3B Output**

Hi,

I am happy to help you create an itinery for Italy for 3 days.
Here is an example itinery:
Day 1:

Morning: Arrive in Rome, check into your hotel.

Afternoon: Visit the Colosseum, the Roman Forum, and the Pantheon.

Evening: Have a dinner at a local restaurant and enjoy the nightlife in Rome.

Day 2:

Morning: Visit the Vatican Museums, St. Peter's Basilica, and the Sistine Chapel.

Afternoon: Take a guided tour of the Italian capital.

Evening: Have a dinner at a local restaurant and enjoy the nightlife in Rome.

Day 3:

Morning: Visit the Fontana di Trevi, the Pantheon, and the Colosseum.

Afternoon: Take a guided tour of the Italian capital.

Evening: Have a dinner at a local restaurant and enjoy the nightlife in Rome.

You can also add visits to other attractions such as the Capitoline Museums, the Palatine Hill, and the Spanish Steps.

I hope this itinery helps you plan your trip to Italy. Let me know if you have any questions or if you need any further assistance.

**Base Model Output**

Hi,
I am a travel agent and I can help you with your travel itinerary.
I can help you with the following:
1. Flight booking
2. Hotel booking
3. Car rental
4. Tour booking
5. Visa
6. Travel insurance
7. Airport transfers
8. Airport pick up
9. Airport drop off
10. Airport lounge
11. Airport lounge pick up
12. Airport lounge drop off

### Explaining Concepts

**Q:** Explain golang structs

**OpenHermes-Llama-3B Output**

A golang struct is a way to define a data type in Go. It is a collection of fields, each of which is a value of a specific type. The fields are arranged in a hierarchy and are connected using interfaces.
Example:

```go
type User struct {
	Name  string
	Age   int
	Email string
}
```
In this example, the User struct contains a Name field, an Age field, an Email field, and a few other fields. The Name field is of type string, the Age field is of type int, and the Email field is of type string. The User struct is connected to other structs through interfaces, such as the interface for the User type.
The main advantage of structs in Go is that they can be used to represent data in a way that is more flexible and easier to use than traditional arrays or maps. Structs also allow for more complex data structures to be created without the need for nested arrays or maps.
Here's an example of how to use a User struct:
```go
package main

import (
	"fmt"
	"strings"
)

type User struct {
	Name  string
	Age   int
	Email string
}

func main() {
	user1 := User{"John", 25, "john@example.com"}
	user2 := User{"Mary", 22, "mary@example.com"}

	// Print the users' names and ages
	fmt.Printf("User 1: %s (%d)\nUser 2: %s (%d)\n", user1.Name, user1.Age, user2.Name, user2.Age)

	// Print the users' emails separated by commas
	email1 := user1.Email
	email2 := user2.Email
	fmt.Printf("User 1's email: %s\nUser 2's email: %s\n", email1, email2)
}
```
In this example, the User struct is used to represent two different users. The Name field is used to store the users' names, the Age field is used to store their ages, and the Email field is used to store their emails. The main function prints out the users' names and ages, along with their emails separated by commas.
In summary, a golang struct is a way to define data types in Go. It is a collection of fields, each of which is a value of a specific type. The fields are arranged in a hierarchy and are connected through interfaces. The main advantage of structs in Go is that they can be used to represent data in a way that is more flexible and easier to use than traditional arrays or maps.

**Base Model Output**

*A struct is a named collection of variables.

# Benchmark Comparison
## HumanEval

| Model                          | pass@1 |
| ------------------------------ | ------ |
| openllama-3b-v2 (base)         | 8.1%   |
| openhermes-llama-3b (finetune) | 9.8%   |
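
The table does not say how many completions were sampled per problem. As a hedged illustration of how pass@1 is commonly computed, here is a minimal sketch of the unbiased pass@k estimator described in the HumanEval paper. The `results` list is made up for the example and is unrelated to the scores above.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: completions sampled, c: completions that pass the tests, k: budget.
    """
    if n - c < k:
        # Every size-k subset contains at least one passing completion.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-problem estimates over the benchmark."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical (samples drawn, samples passing) counts for four problems.
results = [(10, 0), (10, 1), (10, 5), (10, 10)]

if __name__ == "__main__":
    print(f"pass@1 = {mean_pass_at_k(results, k=1):.1%}")
```

With k = 1 the estimator reduces to the fraction of passing completions per problem, averaged over all problems.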