rombodawg committed
Commit 124d3a0 · verified · 1 Parent(s): d5f355b

Update README.md

Files changed (1):
  1. README.md +40 -5
README.md CHANGED
@@ -1,5 +1,40 @@
- ---
- license: other
- license_name: qwen-research
- license_link: https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE
- ---
+ ---
+ base_model:
+ - Qwen/Qwen2.5-3B
+ - Qwen/Qwen2.5-3B-Instruct
+ - arcee-ai/raspberry-3B
+ license: other
+ license_name: qwen-research
+ license_link: https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE
+ tags:
+ - mergekit
+ - merge
+ ---
+ # Rombos-LLM-V2.5.1-Qwen-3b
+
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/pNDtgE5FDkxxvbG4qiZ1A.jpeg)
+
+ A little experiment I threw together: I took a really high-quality LLM I found (arcee-ai/raspberry-3B) and merged it using the last step of my Continuous Finetuning method, outlined in the paper linked below.
+
+ https://docs.google.com/document/d/1OjbjU5AOz4Ftn9xHQrX3oFQGhQ6RDUuXQipnQ9gn6tU/edit?usp=sharing
+
+ The mergekit.yaml file is as follows:
+ ```yaml
+ models:
+   - model: Qwen2.5-3B-Instruct
+     parameters:
+       weight: 1
+       density: 1
+   - model: raspberry-3B
+     parameters:
+       weight: 1
+       density: 1
+ merge_method: ties
+ base_model: Qwen2.5-3B
+ parameters:
+   weight: 1
+   density: 1
+   normalize: true
+   int8_mask: true
+ dtype: bfloat16
+ ```
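
For intuition about what `merge_method: ties` in the config above does, here is a toy NumPy sketch of the TIES idea on plain arrays: trim each model's task vector (delta from the base) by magnitude according to `density`, elect a per-parameter sign, and average only the deltas that agree with the elected sign. This is an illustration only, not mergekit's actual implementation, and the `ties_merge` helper and its toy inputs are made up for this example.

```python
import numpy as np

def ties_merge(base, finetuned, weight=1.0, density=1.0):
    """Toy TIES-style merge (illustrative only, not mergekit's code).

    base:       base model parameters (ndarray)
    finetuned:  list of fine-tuned parameter ndarrays, same shape as base
    weight:     scale applied to each model's delta
    density:    fraction of each delta's entries to keep (by magnitude)
    """
    deltas = [(ft - base) * weight for ft in finetuned]

    # Trim: zero out all but the top-`density` fraction of entries by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(round(density * d.size)))
        thresh = np.sort(np.abs(d).ravel())[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))

    # Elect a sign per parameter: the sign of the summed trimmed deltas wins.
    elected = np.sign(np.sum(np.stack(trimmed), axis=0))

    # Average only the deltas that agree with the elected sign.
    stacked = np.stack(trimmed)
    agree = (np.sign(stacked) == elected) & (stacked != 0)
    numer = (stacked * agree).sum(axis=0)
    denom = np.maximum(agree.sum(axis=0), 1)
    return base + numer / denom

base = np.array([0.0, 0.0, 0.0])
m1 = np.array([0.2, -0.4, 0.1])   # disagrees with m2 in the middle entry
m2 = np.array([0.4, 0.4, 0.1])
merged = ties_merge(base, [m1, m2])
```

With `density: 1` and `weight: 1` as in the config, nothing is trimmed; the first and third entries agree in sign and get averaged, while the conflicting middle entry cancels out.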