EricWesthoff committed on
Commit
381bb0f
1 Parent(s): 47cc467

End of training

Files changed (1)
  1. README.md +242 -2
README.md CHANGED
@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
15
 
16
  This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the None dataset.
17
  It achieves the following results on the evaluation set:
18
- - Loss: 1.1735
19
 
20
  ## Model description
21
 
@@ -40,7 +40,7 @@ The following hyperparameters were used during training:
40
  - seed: 42
41
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
42
  - lr_scheduler_type: linear
43
- - training_steps: 24000
44
 
45
  ### Training results
46
 
@@ -286,6 +286,246 @@ The following hyperparameters were used during training:
286
  | 1.198 | 19.04 | 23800 | 1.1741 |
287
  | 1.1919 | 19.12 | 23900 | 1.1736 |
288
  | 1.149 | 19.2 | 24000 | 1.1735 |
289
 
290
 
291
  ### Framework versions
 
15
 
16
  This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the None dataset.
17
  It achieves the following results on the evaluation set:
18
+ - Loss: 0.5914
19
 
20
  ## Model description
21
 
 
40
  - seed: 42
41
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
42
  - lr_scheduler_type: linear
43
+ - training_steps: 48000
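
With `lr_scheduler_type: linear`, the learning rate decays linearly toward zero at `training_steps`, so raising `training_steps` from 24000 to 48000 changes the rate in effect at any given step. The sketch below illustrates this under stated assumptions: no warmup, and a hypothetical base learning rate (the real value is not visible in this hunk). `linear_lr` is an illustrative helper, not a library function.

```python
def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


BASE_LR = 1e-4  # hypothetical value; the actual learning rate is not shown here

# Compare the rate at step 24000 under the old and new step budgets.
for total in (24000, 48000):
    print(f"training_steps={total}: lr at step 24000 = {linear_lr(24000, total, BASE_LR):.2e}")
```

Under these assumptions, a 24000-step schedule has decayed to zero by step 24000, while a 48000-step schedule is at half the base rate at that point.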
44
 
45
  ### Training results
46
 
 
286
  | 1.198 | 19.04 | 23800 | 1.1741 |
287
  | 1.1919 | 19.12 | 23900 | 1.1736 |
288
  | 1.149 | 19.2 | 24000 | 1.1735 |
289
+ | 1.2083 | 19.28 | 24100 | 1.2311 |
290
+ | 1.2362 | 19.36 | 24200 | 1.2287 |
291
+ | 1.2758 | 19.44 | 24300 | 1.2308 |
292
+ | 1.2554 | 19.52 | 24400 | 1.2333 |
293
+ | 1.2907 | 19.6 | 24500 | 1.2203 |
294
+ | 1.2535 | 19.68 | 24600 | 1.2216 |
295
+ | 1.2817 | 19.76 | 24700 | 1.2221 |
296
+ | 1.2834 | 19.84 | 24800 | 1.2164 |
297
+ | 1.2752 | 19.92 | 24900 | 1.2123 |
298
+ | 1.2982 | 20.0 | 25000 | 1.2207 |
299
+ | 1.2229 | 20.08 | 25100 | 1.1983 |
300
+ | 1.2081 | 20.16 | 25200 | 1.1894 |
301
+ | 1.2322 | 20.24 | 25300 | 1.1889 |
302
+ | 1.248 | 20.32 | 25400 | 1.1880 |
303
+ | 1.2237 | 20.4 | 25500 | 1.1826 |
304
+ | 1.237 | 20.48 | 25600 | 1.1731 |
305
+ | 1.23 | 20.56 | 25700 | 1.1791 |
306
+ | 1.2618 | 20.64 | 25800 | 1.1745 |
307
+ | 1.2452 | 20.72 | 25900 | 1.1707 |
308
+ | 1.2475 | 20.8 | 26000 | 1.1642 |
309
+ | 1.257 | 20.88 | 26100 | 1.1740 |
310
+ | 1.2378 | 20.96 | 26200 | 1.1652 |
311
+ | 1.2055 | 21.04 | 26300 | 1.1479 |
312
+ | 1.1479 | 21.12 | 26400 | 1.1450 |
313
+ | 1.1799 | 21.2 | 26500 | 1.1454 |
314
+ | 1.1724 | 21.28 | 26600 | 1.1372 |
315
+ | 1.1852 | 21.36 | 26700 | 1.1409 |
316
+ | 1.1842 | 21.44 | 26800 | 1.1322 |
317
+ | 1.1843 | 21.52 | 26900 | 1.1292 |
318
+ | 1.1875 | 21.6 | 27000 | 1.1245 |
319
+ | 1.1904 | 21.68 | 27100 | 1.1212 |
320
+ | 1.1814 | 21.76 | 27200 | 1.1171 |
321
+ | 1.1906 | 21.84 | 27300 | 1.1105 |
322
+ | 1.2078 | 21.92 | 27400 | 1.1055 |
323
+ | 1.2157 | 22.0 | 27500 | 1.1058 |
324
+ | 1.1111 | 22.08 | 27600 | 1.0881 |
325
+ | 1.109 | 22.16 | 27700 | 1.0827 |
326
+ | 1.1118 | 22.24 | 27800 | 1.0780 |
327
+ | 1.1279 | 22.32 | 27900 | 1.0749 |
328
+ | 1.1435 | 22.4 | 28000 | 1.0727 |
329
+ | 1.1161 | 22.48 | 28100 | 1.0713 |
330
+ | 1.1295 | 22.56 | 28200 | 1.0717 |
331
+ | 1.1439 | 22.64 | 28300 | 1.0660 |
332
+ | 1.1343 | 22.72 | 28400 | 1.0661 |
333
+ | 1.1564 | 22.8 | 28500 | 1.0557 |
334
+ | 1.1542 | 22.88 | 28600 | 1.0540 |
335
+ | 1.1234 | 22.96 | 28700 | 1.0543 |
336
+ | 1.1001 | 23.04 | 28800 | 1.0453 |
337
+ | 1.045 | 23.12 | 28900 | 1.0357 |
338
+ | 1.0757 | 23.2 | 29000 | 1.0308 |
339
+ | 1.083 | 23.28 | 29100 | 1.0259 |
340
+ | 1.0547 | 23.36 | 29200 | 1.0241 |
341
+ | 1.091 | 23.44 | 29300 | 1.0265 |
342
+ | 1.074 | 23.52 | 29400 | 1.0207 |
343
+ | 1.1001 | 23.6 | 29500 | 1.0191 |
344
+ | 1.0884 | 23.68 | 29600 | 1.0205 |
345
+ | 1.0943 | 23.76 | 29700 | 1.0172 |
346
+ | 1.0869 | 23.84 | 29800 | 1.0121 |
347
+ | 1.0925 | 23.92 | 29900 | 1.0094 |
348
+ | 1.0999 | 24.0 | 30000 | 1.0003 |
349
+ | 1.0 | 24.08 | 30100 | 0.9898 |
350
+ | 1.0128 | 24.16 | 30200 | 0.9874 |
351
+ | 1.0056 | 24.24 | 30300 | 0.9833 |
352
+ | 1.0303 | 24.32 | 30400 | 0.9807 |
353
+ | 1.0201 | 24.4 | 30500 | 0.9731 |
354
+ | 1.0371 | 24.48 | 30600 | 0.9743 |
355
+ | 1.0439 | 24.56 | 30700 | 0.9666 |
356
+ | 1.0424 | 24.64 | 30800 | 0.9670 |
357
+ | 1.0281 | 24.72 | 30900 | 0.9662 |
358
+ | 1.0449 | 24.8 | 31000 | 0.9595 |
359
+ | 1.0556 | 24.88 | 31100 | 0.9540 |
360
+ | 1.0589 | 24.96 | 31200 | 0.9552 |
361
+ | 1.0032 | 25.04 | 31300 | 0.9438 |
362
+ | 0.9534 | 25.12 | 31400 | 0.9400 |
363
+ | 0.9932 | 25.2 | 31500 | 0.9360 |
364
+ | 0.9863 | 25.28 | 31600 | 0.9354 |
365
+ | 0.9759 | 25.36 | 31700 | 0.9275 |
366
+ | 0.9761 | 25.44 | 31800 | 0.9310 |
367
+ | 0.9719 | 25.52 | 31900 | 0.9299 |
368
+ | 0.9702 | 25.6 | 32000 | 0.9269 |
369
+ | 1.0005 | 25.68 | 32100 | 0.9217 |
370
+ | 0.9975 | 25.76 | 32200 | 0.9161 |
371
+ | 0.9935 | 25.84 | 32300 | 0.9134 |
372
+ | 1.0178 | 25.92 | 32400 | 0.9145 |
373
+ | 1.011 | 26.0 | 32500 | 0.9098 |
374
+ | 0.9145 | 26.08 | 32600 | 0.8993 |
375
+ | 0.931 | 26.16 | 32700 | 0.8957 |
376
+ | 0.9326 | 26.24 | 32800 | 0.8905 |
377
+ | 0.9421 | 26.32 | 32900 | 0.8898 |
378
+ | 0.949 | 26.4 | 33000 | 0.8879 |
379
+ | 0.9224 | 26.48 | 33100 | 0.8838 |
380
+ | 0.952 | 26.56 | 33200 | 0.8818 |
381
+ | 0.9431 | 26.64 | 33300 | 0.8741 |
382
+ | 0.9463 | 26.72 | 33400 | 0.8747 |
383
+ | 0.9456 | 26.8 | 33500 | 0.8742 |
384
+ | 0.9533 | 26.88 | 33600 | 0.8734 |
385
+ | 0.9643 | 26.96 | 33700 | 0.8643 |
386
+ | 0.9037 | 27.04 | 33800 | 0.8546 |
387
+ | 0.8834 | 27.12 | 33900 | 0.8552 |
388
+ | 0.9008 | 27.2 | 34000 | 0.8519 |
389
+ | 0.8851 | 27.28 | 34100 | 0.8498 |
390
+ | 0.8812 | 27.36 | 34200 | 0.8485 |
391
+ | 0.9006 | 27.44 | 34300 | 0.8435 |
392
+ | 0.8893 | 27.52 | 34400 | 0.8413 |
393
+ | 0.8949 | 27.6 | 34500 | 0.8372 |
394
+ | 0.908 | 27.68 | 34600 | 0.8349 |
395
+ | 0.9121 | 27.76 | 34700 | 0.8312 |
396
+ | 0.9066 | 27.84 | 34800 | 0.8285 |
397
+ | 0.9146 | 27.92 | 34900 | 0.8291 |
398
+ | 0.9217 | 28.0 | 35000 | 0.8280 |
399
+ | 0.8282 | 28.08 | 35100 | 0.8158 |
400
+ | 0.8346 | 28.16 | 35200 | 0.8131 |
401
+ | 0.8503 | 28.24 | 35300 | 0.8133 |
402
+ | 0.8431 | 28.32 | 35400 | 0.8090 |
403
+ | 0.8479 | 28.4 | 35500 | 0.8087 |
404
+ | 0.8604 | 28.48 | 35600 | 0.8062 |
405
+ | 0.8559 | 28.56 | 35700 | 0.8028 |
406
+ | 0.8644 | 28.64 | 35800 | 0.7994 |
407
+ | 0.8761 | 28.72 | 35900 | 0.7983 |
408
+ | 0.8821 | 28.8 | 36000 | 0.7926 |
409
+ | 0.8712 | 28.88 | 36100 | 0.7918 |
410
+ | 0.8725 | 28.96 | 36200 | 0.7903 |
411
+ | 0.834 | 29.04 | 36300 | 0.7816 |
412
+ | 0.8119 | 29.12 | 36400 | 0.7739 |
413
+ | 0.8063 | 29.2 | 36500 | 0.7716 |
414
+ | 0.8097 | 29.28 | 36600 | 0.7719 |
415
+ | 0.8177 | 29.36 | 36700 | 0.7727 |
416
+ | 0.8098 | 29.44 | 36800 | 0.7683 |
417
+ | 0.8103 | 29.52 | 36900 | 0.7682 |
418
+ | 0.8251 | 29.6 | 37000 | 0.7634 |
419
+ | 0.8382 | 29.68 | 37100 | 0.7635 |
420
+ | 0.8193 | 29.76 | 37200 | 0.7609 |
421
+ | 0.85 | 29.84 | 37300 | 0.7631 |
422
+ | 0.8371 | 29.92 | 37400 | 0.7546 |
423
+ | 0.8304 | 30.0 | 37500 | 0.7508 |
424
+ | 0.7676 | 30.08 | 37600 | 0.7474 |
425
+ | 0.7782 | 30.16 | 37700 | 0.7466 |
426
+ | 0.7754 | 30.24 | 37800 | 0.7450 |
427
+ | 0.7774 | 30.32 | 37900 | 0.7406 |
428
+ | 0.7728 | 30.4 | 38000 | 0.7390 |
429
+ | 0.7812 | 30.48 | 38100 | 0.7361 |
430
+ | 0.79 | 30.56 | 38200 | 0.7339 |
431
+ | 0.8072 | 30.64 | 38300 | 0.7323 |
432
+ | 0.8051 | 30.72 | 38400 | 0.7308 |
433
+ | 0.7895 | 30.8 | 38500 | 0.7268 |
434
+ | 0.7932 | 30.88 | 38600 | 0.7251 |
435
+ | 0.7939 | 30.96 | 38700 | 0.7218 |
436
+ | 0.7643 | 31.04 | 38800 | 0.7168 |
437
+ | 0.7378 | 31.12 | 38900 | 0.7143 |
438
+ | 0.7498 | 31.2 | 39000 | 0.7128 |
439
+ | 0.7448 | 31.28 | 39100 | 0.7109 |
440
+ | 0.749 | 31.36 | 39200 | 0.7092 |
441
+ | 0.7558 | 31.44 | 39300 | 0.7080 |
442
+ | 0.7622 | 31.52 | 39400 | 0.7040 |
443
+ | 0.7572 | 31.6 | 39500 | 0.7047 |
444
+ | 0.7578 | 31.68 | 39600 | 0.6997 |
445
+ | 0.7567 | 31.76 | 39700 | 0.6968 |
446
+ | 0.758 | 31.84 | 39800 | 0.6938 |
447
+ | 0.7645 | 31.92 | 39900 | 0.6935 |
448
+ | 0.7728 | 32.0 | 40000 | 0.6932 |
449
+ | 0.7008 | 32.08 | 40100 | 0.6888 |
450
+ | 0.7172 | 32.16 | 40200 | 0.6898 |
451
+ | 0.6954 | 32.24 | 40300 | 0.6858 |
452
+ | 0.7251 | 32.32 | 40400 | 0.6838 |
453
+ | 0.7229 | 32.4 | 40500 | 0.6804 |
454
+ | 0.7263 | 32.48 | 40600 | 0.6781 |
455
+ | 0.7221 | 32.56 | 40700 | 0.6767 |
456
+ | 0.723 | 32.64 | 40800 | 0.6760 |
457
+ | 0.7396 | 32.72 | 40900 | 0.6747 |
458
+ | 0.7349 | 32.8 | 41000 | 0.6710 |
459
+ | 0.7427 | 32.88 | 41100 | 0.6713 |
460
+ | 0.7479 | 32.96 | 41200 | 0.6655 |
461
+ | 0.7212 | 33.04 | 41300 | 0.6650 |
462
+ | 0.6975 | 33.12 | 41400 | 0.6626 |
463
+ | 0.686 | 33.2 | 41500 | 0.6599 |
464
+ | 0.6874 | 33.28 | 41600 | 0.6584 |
465
+ | 0.695 | 33.36 | 41700 | 0.6569 |
466
+ | 0.6854 | 33.44 | 41800 | 0.6561 |
467
+ | 0.6917 | 33.52 | 41900 | 0.6540 |
468
+ | 0.6994 | 33.6 | 42000 | 0.6527 |
469
+ | 0.6939 | 33.68 | 42100 | 0.6540 |
470
+ | 0.7118 | 33.76 | 42200 | 0.6487 |
471
+ | 0.715 | 33.84 | 42300 | 0.6487 |
472
+ | 0.7164 | 33.92 | 42400 | 0.6452 |
473
+ | 0.7116 | 34.0 | 42500 | 0.6450 |
474
+ | 0.6701 | 34.08 | 42600 | 0.6421 |
475
+ | 0.66 | 34.16 | 42700 | 0.6412 |
476
+ | 0.6709 | 34.24 | 42800 | 0.6381 |
477
+ | 0.6708 | 34.32 | 42900 | 0.6382 |
478
+ | 0.6874 | 34.4 | 43000 | 0.6376 |
479
+ | 0.6838 | 34.48 | 43100 | 0.6350 |
480
+ | 0.6721 | 34.56 | 43200 | 0.6340 |
481
+ | 0.6782 | 34.64 | 43300 | 0.6326 |
482
+ | 0.6831 | 34.72 | 43400 | 0.6300 |
483
+ | 0.6897 | 34.8 | 43500 | 0.6304 |
484
+ | 0.679 | 34.88 | 43600 | 0.6281 |
485
+ | 0.6678 | 34.96 | 43700 | 0.6260 |
486
+ | 0.6705 | 35.04 | 43800 | 0.6251 |
487
+ | 0.6443 | 35.12 | 43900 | 0.6226 |
488
+ | 0.6479 | 35.2 | 44000 | 0.6224 |
489
+ | 0.6434 | 35.28 | 44100 | 0.6199 |
490
+ | 0.6461 | 35.36 | 44200 | 0.6196 |
491
+ | 0.6516 | 35.44 | 44300 | 0.6181 |
492
+ | 0.6516 | 35.52 | 44400 | 0.6190 |
493
+ | 0.6667 | 35.6 | 44500 | 0.6171 |
494
+ | 0.6583 | 35.68 | 44600 | 0.6153 |
495
+ | 0.664 | 35.76 | 44700 | 0.6143 |
496
+ | 0.6548 | 35.84 | 44800 | 0.6135 |
497
+ | 0.6713 | 35.92 | 44900 | 0.6118 |
498
+ | 0.6681 | 36.0 | 45000 | 0.6110 |
499
+ | 0.6315 | 36.08 | 45100 | 0.6079 |
500
+ | 0.6451 | 36.16 | 45200 | 0.6084 |
501
+ | 0.6396 | 36.24 | 45300 | 0.6082 |
502
+ | 0.6291 | 36.32 | 45400 | 0.6072 |
503
+ | 0.6391 | 36.4 | 45500 | 0.6060 |
504
+ | 0.6381 | 36.48 | 45600 | 0.6052 |
505
+ | 0.6417 | 36.56 | 45700 | 0.6041 |
506
+ | 0.6347 | 36.64 | 45800 | 0.6036 |
507
+ | 0.6436 | 36.72 | 45900 | 0.6022 |
508
+ | 0.6352 | 36.8 | 46000 | 0.6012 |
509
+ | 0.6515 | 36.88 | 46100 | 0.6005 |
510
+ | 0.63 | 36.96 | 46200 | 0.5992 |
511
+ | 0.6317 | 37.04 | 46300 | 0.5979 |
512
+ | 0.6313 | 37.12 | 46400 | 0.5977 |
513
+ | 0.6226 | 37.2 | 46500 | 0.5971 |
514
+ | 0.6155 | 37.28 | 46600 | 0.5967 |
515
+ | 0.6248 | 37.36 | 46700 | 0.5961 |
516
+ | 0.6329 | 37.44 | 46800 | 0.5958 |
517
+ | 0.6249 | 37.52 | 46900 | 0.5953 |
518
+ | 0.6264 | 37.6 | 47000 | 0.5946 |
519
+ | 0.6271 | 37.68 | 47100 | 0.5941 |
520
+ | 0.6281 | 37.76 | 47200 | 0.5936 |
521
+ | 0.6222 | 37.84 | 47300 | 0.5931 |
522
+ | 0.6133 | 37.92 | 47400 | 0.5925 |
523
+ | 0.6298 | 38.0 | 47500 | 0.5920 |
524
+ | 0.6123 | 38.08 | 47600 | 0.5918 |
525
+ | 0.6073 | 38.16 | 47700 | 0.5918 |
526
+ | 0.6129 | 38.24 | 47800 | 0.5915 |
527
+ | 0.6336 | 38.32 | 47900 | 0.5914 |
528
+ | 0.6094 | 38.4 | 48000 | 0.5914 |
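
The reported evaluation losses and the epoch and step columns can be cross-checked with a little arithmetic. This is a minimal sketch. It assumes the loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language-model fine-tuning but is not stated in this README. Under that assumption, `exp(loss)` is the perplexity. The variable names are illustrative.

```python
import math

# Final rows of the table before and after this commit: (epoch, step, validation loss).
old_final = (19.2, 24000, 1.1735)
new_final = (38.4, 48000, 0.5914)

for label, (epoch, step, val_loss) in (("old", old_final), ("new", new_final)):
    steps_per_epoch = step / epoch   # optimizer steps per pass over the training data
    perplexity = math.exp(val_loss)  # valid only if the loss is in nats per token
    print(f"{label}: {steps_per_epoch:.0f} steps/epoch, perplexity {perplexity:.3f}")
```

Both rows give the same steps-per-epoch figure, which suggests the second run kept the same dataset size and effective batch size.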
529
 
530
 
531
  ### Framework versions