---
license: apache-2.0
---
# pegasus-indonesian-base_pretrained
GitHub: [PegasusAnthony](https://github.com/nicholaswilven/PEGASUSAnthony/tree/master)
This model is the pretrained counterpart of [pegasus-indonesian-base_finetune](https://huggingface.co/thonyyy/pegasus-indonesian-base_finetune), pretrained on [kaggle id news 2017](https://www.kaggle.com/datasets/aashari/indonesian-news-articles-published-at-2017), [CC_News_id](https://github.com/Wikidepia/indonesian_datasets/tree/master/dump/cc-news), and [OSCAR_2201](https://huggingface.co/datasets/oscar-corpus/OSCAR-2201/viewer/id/train).
After the final epoch (40), it reports the following training and validation metrics:
- Train Loss: 2.34832262992858
- Train Accuracy: 0.262173235416412
- Validation Loss: 2.34894156455993
- Validation Accuracy: 0.266122311353683
- Train Lr: 0.000136618677061051
- Epoch: 40
## Intended uses & limitations
This model is uncased, cannot read special characters other than "," and ".", has a hard time understanding numbers, and has only been tested on news article text.
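Given these limitations, it may help to normalize input text before tokenization. The following is a minimal sketch, not the preprocessing used in training: the function name `normalize_for_model` and the exact character whitelist (ASCII letters, digits, whitespace, "," and ".") are assumptions made for illustration.

```python
import re

def normalize_for_model(text: str) -> str:
    """Lowercase text and keep only letters, digits, whitespace, ',' and '.'.

    Hypothetical helper: the whitelist is an assumption derived from the
    limitations listed above, not the author's actual preprocessing.
    """
    text = text.lower()
    # Replace every character outside the whitelist with a space.
    text = re.sub(r"[^a-z0-9,.\s]", " ", text)
    # Collapse runs of whitespace into a single space.
    text = re.sub(r"\s+", " ", text)
    # Remove the space left in front of ',' or '.' after stripping symbols.
    text = re.sub(r" ([,.])", r"\1", text)
    return text.strip()
```

Replacing unsupported characters with a space rather than deleting them keeps hyphenated words such as "anak-anak" from being fused into one token.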
## Training and evaluation data
Pretraining datasets:
1. [kaggle id news 2017](https://www.kaggle.com/datasets/aashari/indonesian-news-articles-published-at-2017)
2. [CC_News_id](https://github.com/Wikidepia/indonesian_datasets/tree/master/dump/cc-news)
3. [OSCAR_2201](https://huggingface.co/datasets/oscar-corpus/OSCAR-2201/viewer/id/train)
## Training procedure
To replicate the training run, see the [GitHub repository](https://github.com/nicholaswilven/PEGASUSAnthony/tree/master).
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': False, 'is_legacy_optimizer': False, 'learning_rate': 0.005, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, 'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': True}
- training_precision: float32
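```python
from transformers import PegasusConfig

# Assumption: `configuration` is a Hugging Face PegasusConfig;
# the original snippet does not show how it was constructed.
configuration = PegasusConfig()
configuration.vocab_size = 32103
configuration.d_model = 512
configuration.dropout = 0.15
configuration.decoder_attention_heads = 8
configuration.decoder_layers = 12
configuration.decoder_ffn_dim = 3072
configuration.encoder_attention_heads = 8
configuration.encoder_layers = 12
configuration.encoder_ffn_dim = 3072
```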
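As a rough sanity check on the model size implied by these settings, the arithmetic below estimates the per-head dimension and the parameter count. It is a sketch under stated assumptions, not a measurement of the released checkpoint: it assumes standard Transformer layers with biases, input and output embeddings shared, and no learned positional embeddings. The function name `estimate_parameters` is hypothetical.

```python
VOCAB_SIZE = 32103
D_MODEL = 512
FFN_DIM = 3072
HEADS = 8
ENCODER_LAYERS = 12
DECODER_LAYERS = 12

def estimate_parameters() -> int:
    """Approximate parameter count for the configuration above.

    Assumptions (not confirmed by the source): biases on every projection,
    one shared embedding matrix, no learned positional embeddings, and the
    final layer norms and logits bias ignored.
    """
    attention = 4 * (D_MODEL * D_MODEL + D_MODEL)    # Q, K, V, output projections
    feed_forward = (D_MODEL * FFN_DIM + FFN_DIM) + (FFN_DIM * D_MODEL + D_MODEL)
    layer_norm = 2 * D_MODEL                         # scale and bias
    encoder_layer = attention + feed_forward + 2 * layer_norm
    # Decoder layers add a cross-attention block and a third layer norm.
    decoder_layer = 2 * attention + feed_forward + 3 * layer_norm
    embeddings = VOCAB_SIZE * D_MODEL
    return (embeddings
            + ENCODER_LAYERS * encoder_layer
            + DECODER_LAYERS * decoder_layer)

head_dim = D_MODEL // HEADS  # dimension of each attention head
total = estimate_parameters()
print(f"head_dim={head_dim}, approx. parameters={total:,}")
```

Under these assumptions each attention head has 64 dimensions and the model has roughly 130 million parameters.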
### Training results
|Train Loss|Train Accuracy|Validation Loss|Validation Accuracy|Train Lr|Epoch|
|:--------:|:------------:|:-------------:|:-----------------:|:------:|:---:|
|4.1939034461975|0.145276814699172|3.39564657211303|0.186678826808929|0.00499999988824129|1|
|3.13256049156188|0.208270609378814|2.82256889343261|0.233325317502021|0.00499999988824129|2|
|2.84938621520996|0.229006066918373|2.72168040275573|0.23955675959587|0.00499999988824129|3|
|2.76001143455505|0.234559893608093|2.65143990516662|0.243813350796699|0.00499999988824129|4|
|2.70404982566833|0.238061532378196|2.6107530593872|0.246574580669403|0.00452418718487024|5|
|2.6638650894165|0.240613579750061|2.57847166061401|0.248678594827651|0.00409365398809313|6|
|2.63293719291687|0.242613524198532|2.55772447586059|0.250325441360473|0.00370409130118787|7|
|2.60750746726989|0.244251564145088|2.53469848632812|0.251805543899536|0.00335160037502646|8|
|2.58670353889465|0.245637223124504|2.51883554458618|0.253003656864166|0.00303265335969626|9|
|2.56865572929382|0.24682830274105|2.49989652633666|0.254459708929061|0.00274405837990343|10|
|2.55285787582397|0.247884958982467|2.50092124938964|0.254229605197906|0.00248292670585215|11|
|2.53919672966003|0.248811900615692|2.47859454154968|0.255691051483154|0.00224664504639804|12|
|2.52694725990295|0.249630719423294|2.46921157836914|0.25649145245552|0.00203284854069352|13|
|2.51587128639221|0.250377029180526|2.46414017677307|0.257025629281997|0.0018393974751234|14|
|2.50599193572998|0.251064419746398|2.4557819366455|0.257613778114318|0.00166435563005507|15|
|2.49690246582031|0.251682370901107|2.44843244552612|0.258032590150833|0.00150597130414098|16|
|2.48859119415283|0.252267301082611|2.43858122825622|0.258764535188674|0.00136265915352851|17|
|2.48097324371337|0.252792716026306|2.43251323699951|0.259270757436752|0.00123298505786806|18|
|2.47009921073913|0.253554105758667|2.43577146530151|0.258938610553741|0.00111565098632127|19|
|2.45849394798278|0.254375785589218|2.42337107658386|0.260090589523315|0.00100948277395218|20|
|2.44776940345764|0.255127549171447|2.41147446632385|0.260682851076126|0.000913417781703174|21|
|2.43759155273437|0.255834341049194|2.41405510902404|0.260819226503372|0.000826494593638926|22|
|2.42819571495056|0.256486028432846|2.40314364433288|0.26152354478836|0.000747843238059431|23|
|2.41974592208862|0.257094115018844|2.39181518554687|0.262460082769393|0.000676676572766155|24|
|2.41181802749633|0.257666647434234|2.3825569152832|0.263035386800766|0.000612282310612499|25|
|2.4044873714447|0.258173674345016|2.37829279899597|0.263585090637207|0.000554015976376831|26|
|2.39774870872497|0.258645176887512|2.37718510627746|0.263547003269195|0.000501294387504458|27|
|2.39184403419494|0.259076595306396|2.37379837036132|0.264020860195159|0.00045358992065303|28|
|2.38593125343322|0.259495466947555|2.37083029747009|0.264293819665908|0.000410425127483904|29|
|2.38093471527099|0.259853214025497|2.36486291885375|0.264451295137405|0.000371368019841611|30|
|2.37621307373046|0.260185241699218|2.36547923088073|0.264706671237945|0.000336027675075456|31|
|2.37177920341491|0.260504961013793|2.3609721660614|0.264981210231781|0.000304050423437729|32|
|2.3679461479187|0.260774314403533|2.36445379257202|0.264800041913986|0.000275116210104897|33|
|2.3643410205841|0.261037856340408|2.3573100566864|0.265379041433334|0.000248935451963916|34|
|2.36092805862426|0.261268675327301|2.36105728149414|0.264868646860122|0.000225246112677268|35|
|2.35798692703247|0.261485010385513|2.35409832000732|0.265503793954849|0.000203811112442053|36|
|2.35523629188537|0.26168617606163|2.35252356529235|0.265713244676589|0.000184415926923975|37|
|2.35284709930419|0.261859744787216|2.35101222991943|0.265856444835662|0.000166866433573886|38|
|2.35047316551208|0.262033462524414|2.34698224067687|0.266099989414215|0.000150986990774981|39|
|2.34832262992858|0.262173235416412|2.34894156455993|0.266122311353683|0.000136618677061051|40|
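The `Train Lr` column appears consistent with a constant rate of 0.005 for the first four epochs followed by exponential decay of about `exp(-0.1)` per epoch. The sketch below reproduces the logged values under that reading. The hold length and decay rate are fitted to the table, not taken from the training code, and the function name `learning_rate` is hypothetical.

```python
import math

BASE_LR = 0.005    # 'learning_rate' from the optimizer settings
HOLD_EPOCHS = 4    # assumption: the table shows a flat rate for epochs 1-4
DECAY_RATE = 0.1   # assumption: fitted to the logged values

def learning_rate(epoch: int) -> float:
    """Learning rate at a 1-indexed epoch under the assumed schedule."""
    if epoch <= HOLD_EPOCHS:
        return BASE_LR
    return BASE_LR * math.exp(-DECAY_RATE * (epoch - HOLD_EPOCHS))

for epoch in (1, 5, 20, 40):
    print(epoch, learning_rate(epoch))
```

The small differences from the table in the last digits are what one would expect from values logged in float32.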
### Framework versions
- Transformers 4.30.2
- TensorFlow 2.12.0
- Datasets 2.13.1
- Tokenizers 0.13.3
### Special Thanks
Research supported with Cloud TPUs from Google’s TPU Research Cloud (TRC).