Yeon-Su Lee committed on
Commit f67e878 • 1 Parent(s): d56794e

Update README.md

Files changed (1)
  1. README.md +51 -7
README.md CHANGED
@@ -1,14 +1,58 @@
  ---
- language:
- - ko
  base_model:
  - google/gemma-2-2b-it
  ---

- ## Model Overview

- This model was fine-tuned on AI Hub's bill review report summarization data and is optimized for summarizing Korean legal documents.
- It builds on the pre-trained Gemma 2B model, with additional training to adapt it to legal document summarization.

- ## Training Data
- It was trained on AI Hub's bill review report summarization data (https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&dataSetSn=71794).

  ---
+ language: ko
+ license: apache-2.0
+ tags:
+ - summarization
+ - legal
+ - korean
+ datasets:
+ - ai-hub
+ model_name: gemma-2b-it-sum-ko-legal
  base_model:
  - google/gemma-2-2b-it
  ---

+ # Gemma-2B-it-sum-ko-legal

+ ## Model Description

+ **Gemma-2B-it-sum-ko-legal** is a model trained on AI Hub's **bill review report summarization dataset**. It specializes in concisely summarizing Korean documents such as legal texts and bill review reports, and was fine-tuned from the pre-trained **Gemma 2B** model hosted on Hugging Face. It processes long legal documents and automatically extracts their key points so that legal professionals can review documents faster and more efficiently.
+
+ - **Supported language**: Korean
+ - **Key feature**: Optimized for legal document summarization
+
+ ## Training Process
+
+ ### Dataset
+
+ This model was trained on **AI Hub's bill review report summarization dataset**. The dataset is suited to understanding and summarizing the structure and content of legal documents, and it covers a range of legal topics.
+
+ ### Training Method
+
+ The model was fine-tuned from the pre-trained **Gemma 2B** model hosted on Hugging Face, with additional training that reflects the particular characteristics of legal documents. Training used **FP16 mixed-precision training**, with the following main hyperparameters:
+
+ - **Batch size**: 16
+ - **Learning rate**: 5e-5
+ - **Optimizer**: AdamW
+ - **Training epochs**: 3
+ - **Hardware**: NVIDIA A100 GPU
+
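As a rough sketch only: the README does not include the training script, so the snippet below assumes training went through the Hugging Face `Trainer` and maps the hyperparameters listed above onto `transformers.TrainingArguments` keyword names. `TRAINING_KWARGS`, `build_training_args`, `optimizer_steps`, and the output directory are hypothetical names introduced for illustration.

```python
import math

# Hypothetical mapping of the README's hyperparameters onto keyword names
# accepted by transformers.TrainingArguments. The actual training script is
# not shown in the README, so this is an assumption.
TRAINING_KWARGS = {
    "per_device_train_batch_size": 16,  # batch size 16 (assumes one GPU, no gradient accumulation)
    "learning_rate": 5e-5,              # learning rate
    "num_train_epochs": 3,              # training epochs
    "optim": "adamw_torch",             # AdamW optimizer
    "fp16": True,                       # FP16 mixed-precision training
}


def build_training_args(output_dir="./gemma-2b-it-sum-ko-legal"):
    """Build TrainingArguments from the values above.

    Requires transformers and a CUDA GPU (fp16=True); the import is done
    lazily so the plain dict stays usable without the library installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def optimizer_steps(num_examples):
    """Count optimizer updates for a dataset of the given (hypothetical) size."""
    steps_per_epoch = math.ceil(num_examples / TRAINING_KWARGS["per_device_train_batch_size"])
    return steps_per_epoch * TRAINING_KWARGS["num_train_epochs"]
```

For example, `optimizer_steps(1600)` gives 300 updates for a hypothetical 1,600-example training set; the README does not state the actual dataset size.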
+ ## Code Example
+
+ The code below loads this model and summarizes a Korean legal document.
+
+ ```python
+ from transformers import pipeline
+
+ # Load the model and tokenizer
+ pipe_finetuned = pipeline(
+     "text-generation",
+     model="your-username/gemma-2b-it-sum-ko-legal",
+     tokenizer="your-username/gemma-2b-it-sum-ko-legal",
+     max_new_tokens=512,
+ )
+
+ # Input text to summarize. In English: "Bill review reports in Korea are often
+ # very complex and long. Summarizing such documents to grasp the key
+ # information quickly is important."
+ paragraph = """
+ 한국의 법률안 검토 보고서 내용은 매우 복잡하고 긴 경우가 많습니다.
+ 이러한 문서를 요약하여 주요 정보를 빠르게 파악하는 것이 중요합니다.
+ """
+
+ # Request the summary
+ summary = pipe_finetuned(paragraph, do_sample=True, temperature=0.2, top_k=50, top_p=0.95)
+ print(summary[0]["generated_text"])