aiqwe committed on
Commit
faa16a9
•
1 Parent(s): 3e52835

Model save

Browse files
Files changed (1)
  1. README.md +40 -98
README.md CHANGED
@@ -11,107 +11,49 @@ model-index:
  results: []
  ---

- ---
- ## Model Description
- This is an example model created by instruction-tuning the [gemma-2b-it model](https://huggingface.co/google/gemma-2b-it).
- Example code written in Korean is provided so that instruction tuning can be studied easily.
- **GitHub**: [https://github.com/aiqwe/instruction-tuning-with-rag-example](https://github.com/aiqwe/instruction-tuning-with-rag-example)
-
- ## Usage
- ### Inference on GPU example
- ```python
- import torch
- from transformers import AutoTokenizer, AutoModelForCausalLM
-
- tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
- model = AutoModelForCausalLM.from_pretrained(
-     "aiqwe/gemma-2b-it-example-v1",
-     device_map="cuda",
-     torch_dtype=torch.bfloat16,
-     attn_implementation="flash_attention_2"  # requires the flash-attn package
- )
-
- input_text = "아파트 재건축에 대해 알려줘."  # "Tell me about apartment reconstruction."
- input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
-
- outputs = model.generate(**input_ids, max_new_tokens=512)
- print(tokenizer.decode(outputs[0]))
- ```
-
-
- ### Inference on CPU example
- ```python
- import torch
- from transformers import AutoTokenizer, AutoModelForCausalLM
-
- tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
- model = AutoModelForCausalLM.from_pretrained(
-     "aiqwe/gemma-2b-it-example-v1",
-     device_map="cpu",
-     torch_dtype=torch.bfloat16
- )
-
- input_text = "아파트 재건축에 대해 알려줘."  # "Tell me about apartment reconstruction."
- input_ids = tokenizer(input_text, return_tensors="pt").to("cpu")  # keep inputs on CPU to match the model
-
- outputs = model.generate(**input_ids, max_new_tokens=512)
- print(tokenizer.decode(outputs[0]))
- ```
-
- ### Inference on GPU with embedded function example
- RAG is supported through the Naver Search API via a built-in function.
- ```python
- import torch
- from google.colab import userdata  # Colab secrets; store the Naver API keys there
- from transformers import AutoTokenizer, AutoModelForCausalLM
- from utils import generate  # utils.py from the GitHub repository above
-
- tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
- model = AutoModelForCausalLM.from_pretrained(
-     "aiqwe/gemma-2b-it-example-v1",
-     device_map="cuda",
-     torch_dtype=torch.bfloat16,
-     attn_implementation="flash_attention_2"
- )
-
- query = "아파트 재건축에 대해 알려줘."  # "Tell me about apartment reconstruction."
- rag_config = {
-     "api_client_id": userdata.get('NAVER_API_ID'),
-     "api_client_secret": userdata.get('NAVER_API_SECRET')
- }
- completion = generate(
-     model=model,
-     tokenizer=tokenizer,
-     query=query,
-     max_new_tokens=512,
-     rag=True,
-     rag_config=rag_config
- )
- print(completion)
- ```
-
- ## Chat Template
- The model uses the Gemma chat template:
- [gemma-2b-it Chat Template](https://huggingface.co/google/gemma-2b-it#chat-template)
-
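A minimal sketch of applying that chat template with `tokenizer.apply_chat_template`, reusing the example query from the Usage section:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")

messages = [
    # "Tell me about apartment reconstruction."
    {"role": "user", "content": "아파트 재건축에 대해 알려줘."}
]

# add_generation_prompt=True appends the <start_of_turn>model tag
# so the model continues the conversation as the assistant.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
print(prompt)
```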
- ## Training Spec
- Training was done on a single L4 GPU in Google Colab.
-
- | Item                 | Details              |
- |----------------------|----------------------|
- | Training environment | Google Colab         |
- | GPU                  | L4 (22.5GB)          |
- | VRAM during training | approx. 17GB used    |
- | dtype                | bfloat16             |
- | Attention            | FlashAttention-2     |
- | Tuning               | LoRA (r=4, alpha=32) |
- | Learning Rate        | 5e-5                 |
- | LRScheduler          | Cosine               |
- | Optimizer            | adamw_torch_fused    |
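As a rough illustration of the table above, a hedged sketch of the LoRA setup with PEFT; the `target_modules` are an assumption and are not listed in this card:

```python
# Sketch of the base model + LoRA configuration implied by the Training Spec table.
# target_modules is an assumption; it is not stated in this card.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b-it",
    torch_dtype=torch.bfloat16,               # dtype: bfloat16
    attn_implementation="flash_attention_2",  # Attention: FlashAttention-2
    device_map="cuda",                        # single L4 GPU on Google Colab
)

lora_config = LoraConfig(
    r=4,                    # LoRA rank from the table
    lora_alpha=32,          # LoRA alpha from the table
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```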
 
  ### Framework versions

  - PEFT 0.10.0
  - Transformers 4.40.1
  - Pytorch 2.2.1+cu121
- - Datasets 2.19.0
- - Tokenizers 0.19.1
-
- ## GitHub Profile
- GitHub: https://github.com/aiqwe
 
  results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # gemma-2b-it-example-v1
+
+ This model is a fine-tuned version of [google/gemma-1.1-2b-it](https://huggingface.co/google/gemma-1.1-2b-it) on an unknown dataset.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.05
+ - num_epochs: 20
49
+ ### Training results
50
+
51
+
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
52
 
53
  ### Framework versions
54
 
55
  - PEFT 0.10.0
56
  - Transformers 4.40.1
57
  - Pytorch 2.2.1+cu121
58
+ - Datasets 2.19.1
59
+ - Tokenizers 0.19.1