Update README.md
README.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-base_model: Qwen/Qwen2.5-
 tags:
 - text-generation-inference
 - transformers
@@ -19,16 +19,16 @@ pipeline_tag: text-generation
 library_name: transformers
 ---
 
-![Zunich Banner](https://cdn.ruben-roy.com/AI/Zurich/img/banner-
 
-# Zurich
 *A Qwen 2.5 model fine-tuned on the GammaCorpus dataset*
 
 ## Overview
-Zurich
 
 ## Model Details
-- **Base Model:** [Qwen/Qwen2.5-
 - **Type:** Causal Language Models
 - **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
 - **Number of Parameters:** 7.61B
@@ -38,7 +38,7 @@ Zurich 1.5B GammaCorpus v2-10k is a fine-tune of Alibaba's **Qwen 2.5 1.5B Instr
 
 ## Training Details
 
-Zurich-
 
 ## Usage
 
@@ -57,7 +57,7 @@ Here is a code snippet with `apply_chat_template` to show you how to load the to
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-model_name = "rubenroy/Zurich-
 
 model = AutoModelForCausalLM.from_pretrained(
     model_name,
@@ -68,7 +68,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name)
 
 prompt = "How tall is the Eiffel tower?"
 messages = [
-    {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5
     {"role": "user", "content": prompt}
 ]
 text = tokenizer.apply_chat_template(
1 |
---
|
2 |
+
base_model: Qwen/Qwen2.5-7B-Instruct
|
3 |
tags:
|
4 |
- text-generation-inference
|
5 |
- transformers
|
|
|
19 |
library_name: transformers
|
20 |
---
|
21 |
|
22 |
+
![Zunich Banner](https://cdn.ruben-roy.com/AI/Zurich/img/banner-7B-10k.png)
|
23 |
|
24 |
+
# Zurich 7B GammaCorpus v2-10k
|
25 |
*A Qwen 2.5 model fine-tuned on the GammaCorpus dataset*
|
26 |
|
27 |
## Overview
|
28 |
+
Zurich 7B GammaCorpus v2-10k is a fine-tune of Alibaba's **Qwen 2.5 7B Instruct** model. Zurich is designed to outperform other models that have a similar size while also showcasing [GammaCorpus v2-10k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-10k).
|
29 |
|
30 |
## Model Details
|
31 |
+
- **Base Model:** [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
|
32 |
- **Type:** Causal Language Models
|
33 |
- **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
|
34 |
- **Number of Parameters:** 7.61B
|
|
|
38 |
|
39 |
## Training Details
|
40 |
|
41 |
+
Zurich-7B-GCv2-10k underwent fine-tuning with 1 T4 GPU for ~20 minutes and trained with the [Unsloth](https://unsloth.ai/) framework. Zurich-7B-GCv2-10k was trained for **60 Epochs**.
|
42 |
|
43 |
## Usage
|
44 |
|
|
|
57 |
```python
|
58 |
from transformers import AutoModelForCausalLM, AutoTokenizer
|
59 |
|
60 |
+
model_name = "rubenroy/Zurich-7B-GCv2-10k"
|
61 |
|
62 |
model = AutoModelForCausalLM.from_pretrained(
|
63 |
model_name,
|
|
|
68 |
|
69 |
prompt = "How tall is the Eiffel tower?"
|
70 |
messages = [
|
71 |
+
{"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 7B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
|
72 |
{"role": "user", "content": prompt}
|
73 |
]
|
74 |
text = tokenizer.apply_chat_template(
|
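
Note on the last hunk: the diff context stops at the `tokenizer.apply_chat_template(` call, so it does not show how the `messages` list becomes a prompt string. As a rough illustration only, the sketch below renders the same message list by hand in a ChatML-style layout, which is an assumption about how Qwen 2.5 chat templates are laid out. `render_chatml` is a hypothetical helper written for this note and is not part of transformers. The template shipped with the tokenizer may differ in details, so real code should keep calling `apply_chat_template`.

```python
# Hypothetical helper, for illustration only: approximates a ChatML-style
# chat template by hand. Assumption: each turn is wrapped as
# <|im_start|>{role}\n{content}<|im_end|>\n, as in ChatML-style templates.

def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as one prompt string."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from this point.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = "How tall is the Eiffel tower?"
messages = [
    {
        "role": "system",
        "content": (
            "You are Zurich, an AI assistant built on the Qwen 2.5 7B model "
            "developed by Alibaba Cloud, and fine-tuned by Ruben Roy. "
            "You are a helpful assistant."
        ),
    },
    {"role": "user", "content": prompt},
]

text = render_chatml(messages)
print(text)
```

The system prompt is the only part of this message list that the commit changes, so only the first rendered turn differs between the old and new README.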