Update README.md
README.md (CHANGED)
@@ -1,5 +1,9 @@
 ---
-base_model:
+base_model:
+- Qwen/Qwen2.5-72B-Instruct
+- huihui-ai/Qwen2.5-72B-Instruct-abliterated
+- Qwen/Qwen2.5-72B
+- spow12/ChatWaifu_72B_v2.2
 license: mit
 datasets:
 - arcee-ai/EvolKit-75K
@@ -14,12 +18,11 @@ Experimental commander model V1.
 
 Named it Zelensky in order to troll Uncle Elon on Twitter over how bad Grok-2 is.
 
-Training process, low 1 epoch learning rate and evolutionary-merged via https://github.com/arcee-ai/EvolKit
+Training process: a low learning rate for 1 epoch, then evolutionary merging with the three other listed models via https://github.com/arcee-ai/EvolKit
 
-Process on 8x AMD Mi300 192GB gpus.
+The process was repeated multiple times on 8x AMD MI300 192 GB GPUs while also running gpqa_diamond_zeroshot on the LM Eval harness.
 
 Thank you Vultr https://www.vultr.com/register/ for sponsoring the compute.
 
 
-
-Qwen License applies by default.
+The Qwen License still applies by default.
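The diff credits https://github.com/arcee-ai/EvolKit for the evolutionary merge over the four listed base models, but the actual merge recipe is not shown in the commit. As a loose illustration only, here is a hypothetical mergekit-style candidate config written from Python; the merge method, weights, and densities are assumptions, not the author's settings.

```python
# Hypothetical sketch: one merge candidate over the four base models listed in the
# README front matter, expressed as a mergekit-style YAML config. The merge method,
# weights, and densities are invented placeholders; the actual EvolKit search settings
# are not given in the commit.
import yaml  # PyYAML

candidate = {
    "base_model": "Qwen/Qwen2.5-72B-Instruct",
    "merge_method": "dare_ties",  # assumed method, not stated by the author
    "dtype": "bfloat16",
    "models": [
        {"model": "Qwen/Qwen2.5-72B-Instruct",
         "parameters": {"weight": 0.40, "density": 0.6}},
        {"model": "huihui-ai/Qwen2.5-72B-Instruct-abliterated",
         "parameters": {"weight": 0.25, "density": 0.6}},
        {"model": "Qwen/Qwen2.5-72B",
         "parameters": {"weight": 0.20, "density": 0.6}},
        {"model": "spow12/ChatWaifu_72B_v2.2",
         "parameters": {"weight": 0.15, "density": 0.6}},
    ],
}

# An evolutionary search would mutate the weight/density values, build each candidate,
# and keep whichever scores best on the chosen benchmark.
with open("merge-candidate.yaml", "w") as f:
    yaml.safe_dump(candidate, f, sort_keys=False)
```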
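The added line about running gpqa_diamond_zeroshot on the LM Eval harness presumably refers to EleutherAI's lm-evaluation-harness. A minimal sketch of scoring one merge candidate that way, assuming the harness's Python API and a placeholder checkpoint path:

```python
# Minimal sketch: evaluating a merge candidate on GPQA-Diamond (zero-shot) with the
# EleutherAI lm-evaluation-harness (pip install lm-eval). The checkpoint path is a
# placeholder, not the actual model name.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=/path/to/merge-candidate,dtype=bfloat16",
    tasks=["gpqa_diamond_zeroshot"],
    batch_size=8,
)

# Per-task metrics are keyed by task name under "results".
print(results["results"]["gpqa_diamond_zeroshot"])
```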