rubenroy committed
Commit 5b3f608 · verified · 1 Parent(s): 10c7a40

Update README.md

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- base_model: Qwen/Qwen2.5-1.5B-Instruct
+ base_model: Qwen/Qwen2.5-7B-Instruct
  tags:
  - text-generation-inference
  - transformers
@@ -19,16 +19,16 @@ pipeline_tag: text-generation
  library_name: transformers
  ---
 
- ![Zunich Banner](https://cdn.ruben-roy.com/AI/Zurich/img/banner-1.5B-10k.png)
+ ![Zunich Banner](https://cdn.ruben-roy.com/AI/Zurich/img/banner-7B-10k.png)
 
- # Zurich 1.5B GammaCorpus v2-10k
+ # Zurich 7B GammaCorpus v2-10k
  *A Qwen 2.5 model fine-tuned on the GammaCorpus dataset*
 
  ## Overview
- Zurich 1.5B GammaCorpus v2-10k is a fine-tune of Alibaba's **Qwen 2.5 1.5B Instruct** model. Zurich is designed to outperform other models that have a similar size while also showcasing [GammaCorpus v2-10k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-10k).
+ Zurich 7B GammaCorpus v2-10k is a fine-tune of Alibaba's **Qwen 2.5 7B Instruct** model. Zurich is designed to outperform other models that have a similar size while also showcasing [GammaCorpus v2-10k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-10k).
 
  ## Model Details
- - **Base Model:** [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)
+ - **Base Model:** [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
  - **Type:** Causal Language Models
  - **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
  - **Number of Parameters:** 7.61B
@@ -38,7 +38,7 @@ Zurich 1.5B GammaCorpus v2-10k is a fine-tune of Alibaba's **Qwen 2.5 1.5B Instr
 
  ## Training Details
 
- Zurich-1.5B-GCv2-10k underwent fine-tuning with 1 A100 GPU for ~5 minutes and trained with the [Unsloth](https://unsloth.ai/) framework. Zurich-1.5B-GCv2-10k was trained for **60 Epochs**.
+ Zurich-7B-GCv2-10k underwent fine-tuning with 1 T4 GPU for ~20 minutes and trained with the [Unsloth](https://unsloth.ai/) framework. Zurich-7B-GCv2-10k was trained for **60 Epochs**.
 
  ## Usage
 
@@ -57,7 +57,7 @@ Here is a code snippet with `apply_chat_template` to show you how to load the to
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
 
- model_name = "rubenroy/Zurich-1.5B-GCv2-10k"
+ model_name = "rubenroy/Zurich-7B-GCv2-10k"
 
  model = AutoModelForCausalLM.from_pretrained(
  model_name,
@@ -68,7 +68,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name)
 
  prompt = "How tall is the Eiffel tower?"
  messages = [
- {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 1.5B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
+ {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 7B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
  {"role": "user", "content": prompt}
  ]
  text = tokenizer.apply_chat_template(
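
The last hunk stops partway through the README's usage snippet, so the remainder of that call is not visible in this diff. As a rough sketch of what the commit changes in that snippet, the helper below builds the same `messages` list for either model size; only the size string in the system prompt differs between the old and new versions. The names `SYSTEM_TEMPLATE`, `build_messages`, and `size` are hypothetical and introduced here for illustration only; the README defines `messages` inline.

```python
# Hypothetical helper for illustration; not part of the README being diffed.
SYSTEM_TEMPLATE = (
    "You are Zurich, an AI assistant built on the Qwen 2.5 {size} model "
    "developed by Alibaba Cloud, and fine-tuned by Ruben Roy. "
    "You are a helpful assistant."
)


def build_messages(prompt: str, size: str = "7B") -> list[dict[str, str]]:
    """Return a chat-format message list with a size-specific system prompt."""
    return [
        {"role": "system", "content": SYSTEM_TEMPLATE.format(size=size)},
        {"role": "user", "content": prompt},
    ]


# Message lists matching the removed (1.5B) and added (7B) sides of the diff.
old = build_messages("How tall is the Eiffel tower?", size="1.5B")
new = build_messages("How tall is the Eiffel tower?", size="7B")

# The user turn is identical; only the system prompt changes.
assert old[1] == new[1]
assert old[0]["content"] != new[0]["content"]
```

In `transformers`, a list of role/content dictionaries in this shape is the conversation format that `tokenizer.apply_chat_template` accepts.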