# t5-small-thaisum-512

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.6822
- Rouge1: 0.2353
- Rouge2: 0.098
- Rougel: 0.231
- Rougelsum: 0.2327
- Gen Len: 17.5575
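The ROUGE scores above measure n-gram overlap between generated and reference summaries. As a minimal illustration of what ROUGE-1 F1 captures (this is not the implementation used for the card's numbers, which would typically come from the `rouge_score` package with its extra normalization):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Minimal ROUGE-1 F1: clipped unigram overlap between candidate
    and reference. Illustrative only; production evaluations usually
    use the `rouge_score` package, which also applies stemming."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # unigram matches, clipped per word
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"), 4))  # → 0.8333
```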
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
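As a sketch of how the listed hyperparameters map onto the `transformers` Trainer API (the original training script is not included in this card, so the argument names below are an assumption based on the standard `Seq2SeqTrainingArguments` interface in Transformers 4.29):

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: reconstructs the hyperparameters listed above; the actual
# training script used for this model is not part of the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-thaisum-512",
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",    # Adam betas/epsilon above are the defaults
    num_train_epochs=100,
    evaluation_strategy="epoch",   # assumption: metrics are logged once per epoch
    predict_with_generate=True,    # required to compute ROUGE during evaluation
)
```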
## Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
No log | 1.0 | 200 | 0.5586 | 0.0115 | 0.0033 | 0.0115 | 0.0115 | 19.0 |
No log | 2.0 | 400 | 0.5452 | 0.0231 | 0.0062 | 0.0231 | 0.0234 | 18.9825 |
0.5769 | 3.0 | 600 | 0.5324 | 0.0423 | 0.0205 | 0.0414 | 0.0416 | 18.7475 |
0.5769 | 4.0 | 800 | 0.5295 | 0.0224 | 0.0058 | 0.0221 | 0.0228 | 18.915 |
0.472 | 5.0 | 1000 | 0.5546 | 0.098 | 0.0451 | 0.0966 | 0.0986 | 18.94 |
0.472 | 6.0 | 1200 | 0.5497 | 0.0473 | 0.0141 | 0.047 | 0.0484 | 18.68 |
0.472 | 7.0 | 1400 | 0.5467 | 0.081 | 0.0362 | 0.0799 | 0.0802 | 18.9725 |
0.4121 | 8.0 | 1600 | 0.5658 | 0.1309 | 0.0562 | 0.1282 | 0.1286 | 18.6925 |
0.4121 | 9.0 | 1800 | 0.5641 | 0.1282 | 0.0549 | 0.1278 | 0.1284 | 18.48 |
0.3662 | 10.0 | 2000 | 0.5760 | 0.1537 | 0.0821 | 0.1525 | 0.1536 | 18.615 |
0.3662 | 11.0 | 2200 | 0.6014 | 0.175 | 0.0735 | 0.1723 | 0.1732 | 18.3675 |
0.3662 | 12.0 | 2400 | 0.6181 | 0.1643 | 0.0699 | 0.1637 | 0.165 | 18.3275 |
0.3231 | 13.0 | 2600 | 0.6355 | 0.1349 | 0.0526 | 0.1331 | 0.1342 | 18.6325 |
0.3231 | 14.0 | 2800 | 0.6370 | 0.1656 | 0.083 | 0.1621 | 0.1637 | 17.5725 |
0.2906 | 15.0 | 3000 | 0.6326 | 0.173 | 0.0756 | 0.1708 | 0.1729 | 17.925 |
0.2906 | 16.0 | 3200 | 0.6877 | 0.1719 | 0.0722 | 0.1675 | 0.1687 | 18.0375 |
0.2906 | 17.0 | 3400 | 0.7144 | 0.2073 | 0.0888 | 0.2026 | 0.2045 | 17.5625 |
0.2511 | 18.0 | 3600 | 0.7148 | 0.1932 | 0.0731 | 0.1904 | 0.1916 | 17.7 |
0.2511 | 19.0 | 3800 | 0.7166 | 0.1969 | 0.0823 | 0.1938 | 0.1941 | 17.835 |
0.2335 | 20.0 | 4000 | 0.7319 | 0.201 | 0.0798 | 0.2001 | 0.2015 | 17.9775 |
0.2335 | 21.0 | 4200 | 0.7522 | 0.2027 | 0.0827 | 0.1999 | 0.2014 | 18.3925 |
0.2335 | 22.0 | 4400 | 0.7960 | 0.1904 | 0.0769 | 0.1867 | 0.188 | 18.045 |
0.209 | 23.0 | 4600 | 0.7966 | 0.2263 | 0.1026 | 0.2247 | 0.226 | 18.0275 |
0.209 | 24.0 | 4800 | 0.8202 | 0.2255 | 0.0924 | 0.2213 | 0.2223 | 17.535 |
0.1899 | 25.0 | 5000 | 0.8518 | 0.2181 | 0.0938 | 0.2159 | 0.2167 | 17.875 |
0.1899 | 26.0 | 5200 | 0.8737 | 0.2246 | 0.1013 | 0.2223 | 0.2231 | 17.385 |
0.1899 | 27.0 | 5400 | 0.8647 | 0.2237 | 0.1025 | 0.2188 | 0.2206 | 17.525 |
0.1722 | 28.0 | 5600 | 0.8828 | 0.2058 | 0.0777 | 0.2014 | 0.2033 | 17.82 |
0.1722 | 29.0 | 5800 | 0.9233 | 0.2164 | 0.0895 | 0.2127 | 0.2131 | 16.915 |
0.1559 | 30.0 | 6000 | 0.9197 | 0.2141 | 0.0898 | 0.206 | 0.2074 | 16.7375 |
0.1559 | 31.0 | 6200 | 0.9224 | 0.215 | 0.0951 | 0.2138 | 0.2148 | 17.9725 |
0.1559 | 32.0 | 6400 | 0.9181 | 0.2124 | 0.0837 | 0.2081 | 0.2091 | 17.5275 |
0.1443 | 33.0 | 6600 | 0.9495 | 0.2124 | 0.0853 | 0.2095 | 0.2099 | 17.6625 |
0.1443 | 34.0 | 6800 | 0.9250 | 0.1986 | 0.0729 | 0.1951 | 0.1963 | 17.7725 |
0.1355 | 35.0 | 7000 | 0.9943 | 0.1925 | 0.0787 | 0.189 | 0.1905 | 17.46 |
0.1355 | 36.0 | 7200 | 0.9961 | 0.2146 | 0.0904 | 0.2117 | 0.2129 | 16.855 |
0.1355 | 37.0 | 7400 | 0.9963 | 0.2035 | 0.0742 | 0.1989 | 0.1998 | 17.4725 |
0.1256 | 38.0 | 7600 | 1.0356 | 0.2343 | 0.0908 | 0.2274 | 0.2287 | 17.3825 |
0.1256 | 39.0 | 7800 | 1.0512 | 0.2234 | 0.0978 | 0.2187 | 0.2196 | 18.08 |
0.1153 | 40.0 | 8000 | 1.0227 | 0.2321 | 0.0978 | 0.2273 | 0.2284 | 16.855 |
0.1153 | 41.0 | 8200 | 1.0955 | 0.2265 | 0.0928 | 0.2193 | 0.221 | 17.8775 |
0.1153 | 42.0 | 8400 | 1.0699 | 0.2131 | 0.0922 | 0.2084 | 0.2097 | 17.535 |
0.1057 | 43.0 | 8600 | 1.1177 | 0.2375 | 0.0962 | 0.2317 | 0.2325 | 17.5875 |
0.1057 | 44.0 | 8800 | 1.1074 | 0.2473 | 0.1 | 0.2427 | 0.2421 | 17.5875 |
0.1 | 45.0 | 9000 | 1.1022 | 0.2408 | 0.0989 | 0.232 | 0.2345 | 17.22 |
0.1 | 46.0 | 9200 | 1.1364 | 0.2427 | 0.1026 | 0.2362 | 0.2371 | 17.62 |
0.1 | 47.0 | 9400 | 1.0741 | 0.231 | 0.086 | 0.227 | 0.2282 | 17.82 |
0.0947 | 48.0 | 9600 | 1.1516 | 0.2443 | 0.1083 | 0.2385 | 0.2402 | 17.3725 |
0.0947 | 49.0 | 9800 | 1.1216 | 0.2192 | 0.0823 | 0.2142 | 0.2155 | 17.56 |
0.0905 | 50.0 | 10000 | 1.1242 | 0.2215 | 0.0895 | 0.2151 | 0.2155 | 17.3325 |
0.0905 | 51.0 | 10200 | 1.1732 | 0.2142 | 0.0895 | 0.2106 | 0.2119 | 17.055 |
0.0905 | 52.0 | 10400 | 1.1463 | 0.2294 | 0.0991 | 0.2255 | 0.227 | 17.85 |
0.0829 | 53.0 | 10600 | 1.1870 | 0.2167 | 0.091 | 0.2133 | 0.2146 | 17.58 |
0.0829 | 54.0 | 10800 | 1.1741 | 0.2322 | 0.095 | 0.2254 | 0.2275 | 17.2925 |
0.0797 | 55.0 | 11000 | 1.1595 | 0.2234 | 0.0904 | 0.2174 | 0.22 | 17.2625 |
0.0797 | 56.0 | 11200 | 1.2061 | 0.2296 | 0.0982 | 0.2255 | 0.2287 | 17.5525 |
0.0797 | 57.0 | 11400 | 1.2275 | 0.2282 | 0.0924 | 0.2243 | 0.2267 | 17.2825 |
0.0734 | 58.0 | 11600 | 1.2205 | 0.2111 | 0.0821 | 0.2063 | 0.2085 | 17.4775 |
0.0734 | 59.0 | 11800 | 1.2248 | 0.2195 | 0.0901 | 0.2143 | 0.2164 | 17.3475 |
0.0691 | 60.0 | 12000 | 1.2842 | 0.2168 | 0.0821 | 0.2127 | 0.2151 | 17.5925 |
0.0691 | 61.0 | 12200 | 1.2827 | 0.2357 | 0.0999 | 0.2319 | 0.2327 | 17.1875 |
0.0691 | 62.0 | 12400 | 1.3232 | 0.2369 | 0.1 | 0.2309 | 0.2335 | 17.3975 |
0.0669 | 63.0 | 12600 | 1.2934 | 0.2279 | 0.091 | 0.2239 | 0.2255 | 17.7175 |
0.0669 | 64.0 | 12800 | 1.3149 | 0.234 | 0.0915 | 0.2277 | 0.2294 | 17.62 |
0.0602 | 65.0 | 13000 | 1.3568 | 0.2423 | 0.0954 | 0.2362 | 0.2375 | 17.335 |
0.0602 | 66.0 | 13200 | 1.3548 | 0.2439 | 0.0997 | 0.2373 | 0.2397 | 17.4825 |
0.0602 | 67.0 | 13400 | 1.3380 | 0.2287 | 0.0868 | 0.2238 | 0.2259 | 17.64 |
0.0572 | 68.0 | 13600 | 1.3396 | 0.228 | 0.0861 | 0.221 | 0.2231 | 17.3025 |
0.0572 | 69.0 | 13800 | 1.3772 | 0.2375 | 0.099 | 0.2311 | 0.2317 | 17.31 |
0.0528 | 70.0 | 14000 | 1.3955 | 0.2344 | 0.0866 | 0.228 | 0.2298 | 17.3325 |
0.0528 | 71.0 | 14200 | 1.3739 | 0.2231 | 0.0837 | 0.2184 | 0.2203 | 17.4925 |
0.0528 | 72.0 | 14400 | 1.4183 | 0.2357 | 0.0909 | 0.2298 | 0.2313 | 17.41 |
0.0515 | 73.0 | 14600 | 1.4263 | 0.2287 | 0.0889 | 0.224 | 0.2256 | 17.3 |
0.0515 | 74.0 | 14800 | 1.4472 | 0.2427 | 0.0941 | 0.2374 | 0.2383 | 17.4925 |
0.0464 | 75.0 | 15000 | 1.4343 | 0.2279 | 0.091 | 0.2233 | 0.2241 | 17.4475 |
0.0464 | 76.0 | 15200 | 1.4434 | 0.235 | 0.0962 | 0.229 | 0.2301 | 17.3775 |
0.0464 | 77.0 | 15400 | 1.4450 | 0.2352 | 0.0876 | 0.2294 | 0.2312 | 17.3275 |
0.0451 | 78.0 | 15600 | 1.4874 | 0.2425 | 0.093 | 0.2382 | 0.2398 | 17.3125 |
0.0451 | 79.0 | 15800 | 1.4704 | 0.2296 | 0.0826 | 0.2257 | 0.2266 | 17.7475 |
0.0403 | 80.0 | 16000 | 1.5066 | 0.2344 | 0.0935 | 0.2315 | 0.2321 | 17.5 |
0.0403 | 81.0 | 16200 | 1.5247 | 0.2378 | 0.0919 | 0.233 | 0.2342 | 17.4375 |
0.0403 | 82.0 | 16400 | 1.5434 | 0.2347 | 0.0897 | 0.2303 | 0.2316 | 17.69 |
0.0382 | 83.0 | 16600 | 1.5366 | 0.2316 | 0.0986 | 0.2267 | 0.2276 | 17.365 |
0.0382 | 84.0 | 16800 | 1.5463 | 0.2408 | 0.0939 | 0.2371 | 0.2383 | 17.465 |
0.036 | 85.0 | 17000 | 1.5652 | 0.2319 | 0.0939 | 0.2277 | 0.2288 | 17.6075 |
0.036 | 86.0 | 17200 | 1.5848 | 0.2293 | 0.0952 | 0.2246 | 0.2255 | 17.43 |
0.036 | 87.0 | 17400 | 1.6144 | 0.239 | 0.1035 | 0.2343 | 0.2349 | 17.4425 |
0.0332 | 88.0 | 17600 | 1.5723 | 0.2326 | 0.0961 | 0.228 | 0.2299 | 17.495 |
0.0332 | 89.0 | 17800 | 1.5910 | 0.2373 | 0.0998 | 0.231 | 0.2326 | 17.3 |
0.0312 | 90.0 | 18000 | 1.6275 | 0.2362 | 0.0989 | 0.2314 | 0.2325 | 17.5525 |
0.0312 | 91.0 | 18200 | 1.6320 | 0.2367 | 0.0986 | 0.2326 | 0.2336 | 17.4925 |
0.0312 | 92.0 | 18400 | 1.6487 | 0.2353 | 0.0986 | 0.2299 | 0.2321 | 17.465 |
0.0286 | 93.0 | 18600 | 1.6535 | 0.2359 | 0.0979 | 0.2315 | 0.2336 | 17.5575 |
0.0286 | 94.0 | 18800 | 1.6568 | 0.2406 | 0.0977 | 0.2356 | 0.237 | 17.63 |
0.0274 | 95.0 | 19000 | 1.6588 | 0.2359 | 0.0956 | 0.2314 | 0.2339 | 17.5575 |
0.0274 | 96.0 | 19200 | 1.6772 | 0.2442 | 0.0996 | 0.2397 | 0.2411 | 17.5675 |
0.0274 | 97.0 | 19400 | 1.6732 | 0.2344 | 0.0949 | 0.2304 | 0.2313 | 17.4925 |
0.025 | 98.0 | 19600 | 1.6801 | 0.2329 | 0.096 | 0.2287 | 0.2299 | 17.5525 |
0.025 | 99.0 | 19800 | 1.6837 | 0.2349 | 0.098 | 0.2306 | 0.2323 | 17.545 |
0.0245 | 100.0 | 20000 | 1.6822 | 0.2353 | 0.098 | 0.231 | 0.2327 | 17.5575 |
## Framework versions
- Transformers 4.29.1
- Pytorch 2.0.0+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3