noahkim committed on
Commit
e9a3236
•
1 Parent(s): 36d93aa

update model card README.md

Files changed (1)
  1. README.md +46 -33
README.md CHANGED
@@ -1,46 +1,59 @@
  ---
- language: ko
  tags:
- - summarization
- - bigbird
- - bart
- inference: false
-
  ---
- - This model combines [monologg/kobigbird-bert-base](https://huggingface.co/monologg/kobigbird-bert-base) and [ainize/kobart-news](https://huggingface.co/ainize/kobart-news), fine-tuned on the [daekeun-ml/naver-news-summarization-ko](https://huggingface.co/datasets/daekeun-ml/naver-news-summarization-ko) dataset.

- <<20220917 Commit>>

- ๊ฐœ์ธ ์Šคํ„ฐ๋””์šฉ์œผ๋กœ ๊ธด ๋ฌธ์žฅ(๋‰ด์Šค ๋“ฑ)์˜ ์š”์•ฝ ๋ชจ๋ธ ํŠนํ™”๋œ ๋ชจ๋ธ์„ ๋งŒ๋“ค๊ธฐ ์œ„ํ•ด BERT๊ธฐ๋ฐ˜์˜ KoBigBird ๋ชจ๋ธ์„ Encoder Decoder๋กœ ๋ณ€ํ™˜ํ•œ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.
15
- ๊ธฐ์กด์˜ monologg๋‹˜์˜ KoBigBird๋Š” BERT๊ธฐ๋ฐ˜์œผ๋กœ ๋›ฐ์–ด๋‚œ ์„ฑ๋Šฅ์„ ์ž๋ž‘ํ•˜์ง€๋งŒ ์ƒ์„ฑ ์š”์•ฝ ๋ถ€๋ถ„์— ์žˆ์–ด์„œ๋Š” Decoder๊ฐ€ ์—†๊ธฐ ๋•Œ๋ฌธ์— ์ถ”๊ฐ€์ ์œผ๋กœ Decoder๋ฅผ ์ถ”๊ฐ€ํ–ˆ์Šต๋‹ˆ๋‹ค.
16
 
17
- ๋งŒ๋“ค์—ˆ๋˜ ์ดˆ๊ธฐ ๋ชจ๋ธ์€ KoBigBird์˜ Encoder๋ฅผ Decoder๋กœ ํ™œ์šฉํ•˜์—ฌ ๋งŒ๋“œ์—ˆ์Šต๋‹ˆ๋‹ค๋งŒ, ์ž์ž˜ํ•œ ์˜ค๋ฅ˜๋กœ ์ธํ•˜์—ฌ monologg๋‹˜์˜ KoBigBird-bert-base์˜ Encoder ๋ถ€๋ถ„๊ณผ ainize๋‹˜์˜ KoBART-news์˜ Decoder๋ฅผ ์ด์–ด์„œ ๋งŒ๋“ค์—ˆ์Šต๋‹ˆ๋‹ค. config ์ˆ˜์ • ๋“ฑ hyper-parameter
18
- finetuned ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ daekeun-ml๋‹˜์ด ์ œ๊ณตํ•ด์ฃผ์‹  naver-news-summarization-ko ๋ฐ์ดํ„ฐ์…‹์„ ํ™œ์šฉํ–ˆ์Šต๋‹ˆ๋‹ค.
19
 
20
- ์ดํ›„ AIํ—ˆ๋ธŒ์—์„œ ์ œ๊ณตํ•˜๋Š” ์š”์•ฝ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ์ถ”๊ฐ€ ํ•™์Šต ์ง„ํ–‰ ์˜ˆ์ •์ž…๋‹ˆ๋‹ค.
21
 
22
- ์„ฑ๋Šฅ๋„ ๋งŽ์ด ์•ˆ์ข‹๊ณ  ์ด์ƒํ•˜์ง€๋งŒ, ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ์— ๋Œ€ํ•ด์„œ ๊ด€์‹ฌ๋„ ์žˆ๊ณ  ์ œ๋Œ€๋กœ ํ™œ์šฉํ•˜๊ณ  ์‹ถ์–ด ์Šค์Šค๋กœ ๋งŒ๋“ค์–ด๋ณด๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
23
- ์ง€์†์ ์œผ๋กœ ๋ฐœ์ „์‹œ์ผœ ์ข‹์€ ์„ฑ๋Šฅ์˜ ๋ชจ๋ธ์„ ๊ตฌํ˜„ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.
24
- ๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค.
 
 
 
 
 
 
25
 
26
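The encoder-decoder stitching described above can be sketched with the `EncoderDecoderModel` class in `transformers`. This is a minimal sketch under the assumption that `from_encoder_decoder_pretrained` was the mechanism used; the author's actual conversion script is not included in this card, and the special-token wiring below is illustrative:

```python
# Minimal sketch (not the author's exact script): join the KoBigBird encoder
# to the KoBART decoder using Hugging Face's EncoderDecoderModel.
from transformers import AutoTokenizer, EncoderDecoderModel

# Loads kobigbird-bert-base as the encoder and the decoder half of
# kobart-news (with cross-attention weights added) as the decoder.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "monologg/kobigbird-bert-base",  # encoder checkpoint
    "ainize/kobart-news",            # decoder checkpoint
)

# Generation needs the decoder's special token ids on the combined config
# (illustrative wiring; the exact values used for this model are not documented).
decoder_tokenizer = AutoTokenizer.from_pretrained("ainize/kobart-news")
model.config.decoder_start_token_id = decoder_tokenizer.bos_token_id
model.config.pad_token_id = decoder_tokenizer.pad_token_id
model.config.eos_token_id = decoder_tokenizer.eos_token_id
```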
- <pre><code>
- # Load the tokenizer and the fine-tuned summarization model from the Hub
- from transformers import AutoTokenizer
- from transformers import AutoModelForSeq2SeqLM

- tokenizer = AutoTokenizer.from_pretrained("noahkim/KoBigBird-KoBart-News-Summarization")
- model = AutoModelForSeq2SeqLM.from_pretrained("noahkim/KoBigBird-KoBart-News-Summarization")
- </code></pre>
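As a usage note, the loaded model can be driven with `generate`. A hedged sketch; the truncation length and generation settings below are example values, not ones documented in this card:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the checkpoint as in the snippet above.
tokenizer = AutoTokenizer.from_pretrained("noahkim/KoBigBird-KoBart-News-Summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("noahkim/KoBigBird-KoBart-News-Summarization")

article = "..."  # replace with a long Korean news article

# 4096 is an assumed input cap (BigBird-style long input), not a documented value.
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=4096)
summary_ids = model.generate(
    inputs["input_ids"],
    max_length=128,  # example limit on summary length
    num_beams=4,     # example beam-search width
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```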
 
 
 
- @software{jangwon_park_2021_5654154,
-   author    = {Jangwon Park and Donggyu Kim},
-   title     = {KoBigBird: Pretrained BigBird Model for Korean},
-   month     = nov,
-   year      = 2021,
-   publisher = {Zenodo},
-   version   = {1.0.0},
-   doi       = {10.5281/zenodo.5654154},
-   url       = {https://doi.org/10.5281/zenodo.5654154}
- }

  ---
  tags:
+ - generated_from_trainer
+ model-index:
+ - name: KoBigBird-KoBart-News-Summarization
+   results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # KoBigBird-KoBart-News-Summarization
+
+ This model is a fine-tuned version of [noahkim/KoBigBird-KoBart-News-Summarization](https://huggingface.co/noahkim/KoBigBird-KoBart-News-Summarization) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 4.1236
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (a hedged sketch of matching training arguments follows this list):
+ - learning_rate: 2e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 4
+ - mixed_precision_training: Native AMP
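The hyperparameters above map directly onto `transformers` training arguments. A minimal sketch, assuming a `Seq2SeqTrainingArguments`/`Seq2SeqTrainer` setup (the card does not state which Trainer subclass produced it); `model`, `tokenizer`, `train_ds`, and `eval_ds` are placeholders:

```python
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer

# Training arguments mirroring the list above (output_dir is a placeholder).
training_args = Seq2SeqTrainingArguments(
    output_dir="KoBigBird-KoBart-News-Summarization",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=4,
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",  # assumed; matches the per-epoch results table below
)

# trainer = Seq2SeqTrainer(
#     model=model,              # the encoder-decoder model
#     args=training_args,
#     train_dataset=train_ds,   # placeholder dataset objects
#     eval_dataset=eval_ds,
#     tokenizer=tokenizer,
# )
# trainer.train()
```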
 
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | 4.0748        | 1.0   | 1388 | 4.3067          |
+ | 3.8457        | 2.0   | 2776 | 4.2039          |
+ | 3.7459        | 3.0   | 4164 | 4.1433          |
+ | 3.6773        | 4.0   | 5552 | 4.1236          |
+
+ ### Framework versions
+
+ - Transformers 4.24.0
+ - Pytorch 1.12.1+cu113
+ - Datasets 2.6.1
+ - Tokenizers 0.13.2