rombodawg/Rombos-LLM-V2.5.1-Qwen-3b


rombodawg committed on Oct 8, 2024

Commit 124d3a0 (verified) · 1 Parent(s): d5f355b

Update README.md

Files changed (1)

README.md +40 -5


@@ -1,5 +1,40 @@
----
-license: other
-license_name: qwen-research
-license_link: https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE
----

+---
+base_model:
+- Qwen/Qwen2.5-3B
+- Qwen/Qwen2.5-3B-Instruct
+- arcee-ai/raspberry-3B
+license: other
+license_name: qwen-research
+license_link: https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE
+tags:
+  - mergekit
+  - merge
+---
+# Rombos-LLM-V2.5.1-Qwen-3b
+![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/pNDtgE5FDkxxvbG4qiZ1A.jpeg)
+A little experiment I threw together to take a really high-quality LLM I found (arcee-ai/raspberry-3B) and merge it using the last step of my Continuous Finetuning method, outlined in the paper linked below.
+https://docs.google.com/document/d/1OjbjU5AOz4Ftn9xHQrX3oFQGhQ6RDUuXQipnQ9gn6tU/edit?usp=sharing
+Mergekit.yaml file is as follows:
+```yaml
+models:
+  - model: Qwen2.5-3B-Instruct
+    parameters:
+      weight: 1
+      density: 1
+  - model: raspberry-3B
+    parameters:
+      weight: 1
+      density: 1
+merge_method: ties
+base_model: Qwen2.5-3B
+parameters:
+  weight: 1
+  density: 1
+  normalize: true
+  int8_mask: true
+dtype: bfloat16
+```
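
Reviewer note (not part of the diff): to make the `merge_method: ties` setting in the config above more concrete, here is a minimal pure-Python sketch of the general TIES idea on flat lists of floats. It takes each model's difference from the base, optionally trims to the largest-magnitude entries (`density`), elects a sign per parameter, and averages only the entries that agree with it. The function names `trim` and `ties_merge` are hypothetical, and this is an illustration under those assumptions, not mergekit's actual implementation, which may differ in details.

```python
def trim(delta, density):
    """Keep the `density` fraction of entries with the largest magnitude; zero the rest."""
    k = int(round(density * len(delta)))
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(delta)]


def ties_merge(base, finetuned, weights=None, density=1.0):
    """Illustrative TIES-style merge of several fine-tuned models onto a shared base.

    base:      list of floats (base model parameters, flattened)
    finetuned: list of lists of floats, one per fine-tuned model
    weights:   per-model weights (defaults to 1.0 each, as in the config)
    density:   fraction of each task vector to keep (1.0 means no trimming)
    """
    if weights is None:
        weights = [1.0] * len(finetuned)

    # Task vectors: how far each fine-tuned model moved away from the base.
    deltas = [[w - b for w, b in zip(model, base)] for model in finetuned]
    trimmed = [trim(d, density) for d in deltas]

    merged = []
    for j, b in enumerate(base):
        contribs = [(w, d[j]) for w, d in zip(weights, trimmed)]
        # Elect a sign from the weighted sum of the task vectors at this position.
        total = sum(w * v for w, v in contribs)
        sign = 1.0 if total >= 0 else -1.0
        # Average only the entries that agree with the elected sign
        # (dividing by the agreeing weights corresponds to normalization).
        agree = [(w, v) for w, v in contribs if v * sign > 0]
        wsum = sum(w for w, _ in agree)
        delta = sum(w * v for w, v in agree) / wsum if wsum else 0.0
        merged.append(b + delta)
    return merged


if __name__ == "__main__":
    base = [0.0, 1.0, 2.0]
    model_a = [1.0, 1.5, 1.0]
    model_b = [3.0, 0.75, 2.0]
    print(ties_merge(base, [model_a, model_b]))
```

With `density: 1` and `weight: 1` for both models, as in the config, nothing is trimmed, so the merge reduces to sign election followed by an average of the agreeing differences added back onto the base.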