Commit ce7b25b
Parent(s): ca6af82
Update README.md

README.md CHANGED
---
base_model: Alignment-Lab-AI/Neural-network-medium-untuned-theta
tags:
- axolotl
- Alignment-Lab-AI
- Meta-Llama-3
model-index:
- name: Buzz-8b-Large-0.5
  results: []
license: apache-2.0
datasets:
- H-D-T/Buzz
language:
- en
---
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6436279eaaef013d1af225c9/fWaQucBWfabfnMsAFN8hv.png)

# Buzz-8b-Large: Advancing Efficiency through Iterative Fine-Tuning

## Introduction

[Alignment Lab AI](https://AlignmentLab.ai) is pleased to introduce our latest research effort:

**Buzz-8b-Large**, a state-of-the-art language model developed in collaboration with [Hive Digital Technologies](https://hivedt.com/).

The Buzz model, dataset, and code are being released as a toolkit that demonstrates how existing pretrained language models can be reused and optimized to keep raising the performance achievable with an optimal use of FLOPs. Alongside Buzz-8b-Large, we release:

- [The Buzz Dataset](https://huggingface.co/datasets/H-D-T/Buzz)
- [Buzz-2.5b-Small] soon!
- [Buzz-5b-Medium] soon!
- [Buzz-8B-Large](https://huggingface.co/tempbuzz/Lab-AI/Buzz-8B-Large)

The **Buzz dataset** and the two additional models, **Buzz-2.5B-Small** and **Buzz-5B-Medium**, along with the codebase to refine, filter, and augment the data and to prune and train your own variants, will be released in the coming days.
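
The dataset is already live on the Hub, so it can be pulled directly with the `datasets` library. A minimal sketch, assuming the default configuration and a `train` split (adjust to whatever splits the released dataset actually exposes):

```python
from datasets import load_dataset

# Stream the Buzz dataset from the Hugging Face Hub so nothing is
# downloaded up front; the "train" split name is an assumption.
buzz = load_dataset("H-D-T/Buzz", split="train", streaming=True)

# Peek at the first record to inspect the schema before filtering or training.
print(next(iter(buzz)))
```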

## Performance

Buzz-8b-Large achieves remarkably low train and validation loss, with loss on unseen data reaching around **0.5** by the end of training. This performance showcases the effectiveness of our novel iterative fine-tuning approach, which maximizes the reuse of pretrained weights. Even the smallest variant, Buzz-Small, maintains a steady train loss of approximately **0.4-0.6** on entirely new data and held-out sets.

[ benchmark scores table here]

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6436279eaaef013d1af225c9/wyHyDIJnNmbomonZKQAD0.png)

Training runs are logged on Weights & Biases:

- https://wandb.ai/llm_surgery/llama-3-8b-vs-5b
- https://wandb.ai/autometa/neural-network-1
- https://wandb.ai/autometa/buzz-baby?nw=nwuserautometa
- https://wandb.ai/autometa/buzz-brother?nw=nwuserautometa
- https://wandb.ai/autometa/buzz-big?nw=nwuserautometa

## Chat Template and Inference

To use the Buzz-8b-Large model for chat-based tasks, you can rely on its chat template. Here's an example of how to load the model and perform inference with the Hugging Face Transformers library (a chat-template formatted variant follows the snippet):
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
model_name = "H-D-T/Buzz-8b-Large-v0.5"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Set the device to run the model on (e.g., "cuda" for GPU, "cpu" for CPU)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Define the input prompt
prompt = "Hello, how are you today?"

# Tokenize the input prompt
input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)

# Generate the model's response
output = model.generate(
    input_ids,
    max_length=100,
    num_return_sequences=1,
    no_repeat_ngram_size=2,
    early_stopping=True
)

# Decode the generated response
response = tokenizer.decode(output[0], skip_special_tokens=True)

print("Input:", prompt)
print("Response:", response)
```
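
The snippet above feeds the model a raw string. To actually apply the chat template this section refers to, the same objects (`tokenizer`, `model`, `device`) can be reused with `tokenizer.apply_chat_template`. A minimal sketch, assuming the tokenizer ships with a chat template; the example conversation is illustrative:

```python
# Continues from the snippet above, reusing `tokenizer`, `model`, and `device`.
messages = [
    {"role": "user", "content": "Hello, how are you today?"},
    {"role": "assistant", "content": "I'm doing well, thank you for asking! How can I assist you today?"},
    {"role": "user", "content": "Can you tell me a joke?"},
]

# Render the conversation with the model's chat template and append the
# assistant header so the model generates the next reply.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(device)

output = model.generate(input_ids, max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print("Response:", response)
```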
## Conclusion