RASMUS commited on
Commit
b7b0d85
β€’
1 Parent(s): bfaf210

Training in progress, step 31000

Browse files
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:837fabe2f9887cb975f4c577c865c8c31d3ecc4fcc9afcd2a38e8c97918a8409
3
  size 3219908024
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:df95fbcfc0eb4bfed629e4280bbb789a37aa3cf5571ceb0ee20543d94a2cc062
3
  size 3219908024
wandb/debug-internal.log CHANGED
The diff for this file is too large to render. See raw diff
 
wandb/run-20231122_234140-blvl5h7j/files/output.log CHANGED
@@ -2274,3 +2274,1075 @@ Reading metadata...: 1it [00:00, 6.49it/s]
2274
 
2275
 
2276
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2274
 
2275
 
2276
 
2277
+
2278
+
2279
+
2280
+
2281
+
2282
+
2283
+
2284
+
2285
+
2286
+
2287
+
2288
+
2289
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30079/60000 [12:18:14<80:32:25, 9.69s/it]
2290
+
2291
+
2292
+
2293
+
2294
+
2295
+
2296
+
2297
+
2298
+
2299
+
2300
+
2301
+
2302
+
2303
+
2304
+
2305
+
2306
+
2307
+
2308
+
2309
+
2310
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30099/60000 [12:21:31<80:38:44, 9.71s/it]
2311
+
2312
+
2313
+
2314
+
2315
+
2316
+
2317
+
2318
+
2319
+
2320
+
2321
+
2322
+
2323
+
2324
+
2325
+
2326
+
2327
+
2328
+
2329
+
2330
+
2331
+
2332
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30120/60000 [12:25:23<83:32:40, 10.07s/it]
2333
+
2334
+
2335
+
2336
+
2337
+
2338
+
2339
+
2340
+
2341
+
2342
+
2343
+
2344
+
2345
+
2346
+
2347
+
2348
+
2349
+
2350
+
2351
+
2352
+
2353
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30140/60000 [12:28:42<82:54:00, 9.99s/it]
2354
+
2355
+
2356
+
2357
+
2358
+
2359
+
2360
+
2361
+
2362
+
2363
+
2364
+
2365
+
2366
+
2367
+
2368
+
2369
+
2370
+
2371
+
2372
+
2373
+
2374
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30160/60000 [12:31:52<78:04:06, 9.42s/it]
2375
+
2376
+
2377
+
2378
+
2379
+
2380
+
2381
+
2382
+
2383
+
2384
+
2385
+
2386
+
2387
+
2388
+
2389
+
2390
+
2391
+
2392
+
2393
+
2394
+
2395
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30180/60000 [12:35:03<79:27:09, 9.59s/it]
2396
+
2397
+
2398
+
2399
+
2400
+
2401
+
2402
+
2403
+
2404
+
2405
+
2406
+
2407
+
2408
+
2409
+
2410
+
2411
+
2412
+
2413
+
2414
+
2415
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30199/60000 [12:38:06<79:25:53, 9.60s/it]
2416
+
2417
+
2418
+
2419
+
2420
+
2421
+
2422
+
2423
+
2424
+
2425
+
2426
+
2427
+
2428
+
2429
+
2430
+
2431
+
2432
+
2433
+
2434
+
2435
+
2436
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30219/60000 [12:41:16<78:47:02, 9.52s/it]
2437
+
2438
+
2439
+
2440
+
2441
+
2442
+
2443
+
2444
+
2445
+
2446
+
2447
+
2448
+
2449
+
2450
+
2451
+
2452
+
2453
+
2454
+
2455
+
2456
+
2457
+
2458
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30240/60000 [12:44:41<77:42:06, 9.40s/it]
2459
+
2460
+
2461
+
2462
+
2463
+
2464
+
2465
+
2466
+
2467
+
2468
+
2469
+
2470
+
2471
+
2472
+
2473
+
2474
+
2475
+
2476
+
2477
+
2478
+
2479
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30260/60000 [12:47:50<78:27:31, 9.50s/it]
2480
+
2481
+
2482
+
2483
+
2484
+
2485
+
2486
+
2487
+
2488
+
2489
+
2490
+
2491
+
2492
+
2493
+
2494
+
2495
+
2496
+
2497
+
2498
+
2499
+
2500
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30280/60000 [12:51:14<84:49:11, 10.27s/it]
2501
+
2502
+
2503
+
2504
+
2505
+
2506
+
2507
+
2508
+
2509
+
2510
+
2511
+
2512
+
2513
+
2514
+
2515
+
2516
+
2517
+
2518
+
2519
+
2520
+
2521
+ 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30300/60000 [12:54:59<93:55:24, 11.38s/it]
2522
+
2523
+
2524
+
2525
+
2526
+
2527
+
2528
+
2529
+
2530
+
2531
+
2532
+
2533
+
2534
+
2535
+
2536
+
2537
+
2538
+
2539
+
2540
+
2541
+
2542
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30320/60000 [12:58:16<79:14:44, 9.61s/it]
2543
+
2544
+
2545
+
2546
+
2547
+
2548
+
2549
+
2550
+
2551
+
2552
+
2553
+
2554
+
2555
+
2556
+
2557
+
2558
+
2559
+
2560
+
2561
+
2562
+
2563
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30340/60000 [13:01:28<79:01:40, 9.59s/it]
2564
+
2565
+
2566
+
2567
+
2568
+
2569
+
2570
+
2571
+
2572
+
2573
+
2574
+
2575
+
2576
+
2577
+
2578
+
2579
+
2580
+
2581
+
2582
+
2583
+
2584
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 30360/60000 [13:04:38<77:57:09, 9.47s/it]
2585
+
2586
+
2587
+
2588
+
2589
+
2590
+
2591
+
2592
+
2593
+
2594
+
2595
+
2596
+
2597
+
2598
+
2599
+
2600
+
2601
+
2602
+
2603
+
2604
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30379/60000 [13:07:40<78:18:06, 9.52s/it]
2605
+
2606
+
2607
+
2608
+
2609
+
2610
+
2611
+
2612
+
2613
+
2614
+
2615
+
2616
+
2617
+
2618
+
2619
+
2620
+
2621
+
2622
+
2623
+
2624
+
2625
+
2626
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30400/60000 [13:11:02<79:33:20, 9.68s/it]
2627
+
2628
+
2629
+
2630
+
2631
+
2632
+
2633
+
2634
+
2635
+
2636
+
2637
+
2638
+
2639
+
2640
+
2641
+
2642
+
2643
+
2644
+
2645
+
2646
+
2647
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30420/60000 [13:14:40<80:27:51, 9.79s/it]
2648
+
2649
+
2650
+
2651
+
2652
+
2653
+
2654
+
2655
+
2656
+
2657
+
2658
+
2659
+
2660
+
2661
+
2662
+
2663
+
2664
+
2665
+
2666
+
2667
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30439/60000 [13:17:44<80:26:22, 9.80s/it]
2668
+
2669
+
2670
+
2671
+
2672
+
2673
+
2674
+
2675
+
2676
+
2677
+
2678
+
2679
+
2680
+
2681
+
2682
+
2683
+
2684
+
2685
+
2686
+
2687
+
2688
+
2689
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30460/60000 [13:21:07<78:05:36, 9.52s/it]
2690
+
2691
+
2692
+
2693
+
2694
+
2695
+
2696
+
2697
+
2698
+
2699
+
2700
+
2701
+
2702
+
2703
+
2704
+
2705
+
2706
+
2707
+
2708
+
2709
+
2710
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30480/60000 [13:24:20<78:26:07, 9.57s/it]
2711
+
2712
+
2713
+
2714
+
2715
+
2716
+
2717
+
2718
+
2719
+
2720
+
2721
+
2722
+
2723
+
2724
+
2725
+
2726
+
2727
+
2728
+
2729
+
2730
+
2731
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30500/60000 [13:27:41<79:28:50, 9.70s/it]
2732
+
2733
+
2734
+
2735
+
2736
+
2737
+
2738
+
2739
+
2740
+
2741
+
2742
+
2743
+
2744
+
2745
+
2746
+
2747
+
2748
+
2749
+
2750
+
2751
+
2752
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30520/60000 [13:30:53<78:24:01, 9.57s/it]
2753
+
2754
+
2755
+
2756
+
2757
+
2758
+
2759
+
2760
+
2761
+
2762
+
2763
+
2764
+
2765
+
2766
+
2767
+
2768
+
2769
+
2770
+
2771
+
2772
+
2773
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30540/60000 [13:34:06<78:36:09, 9.61s/it]
2774
+
2775
+
2776
+
2777
+
2778
+
2779
+
2780
+
2781
+
2782
+
2783
+
2784
+
2785
+
2786
+
2787
+
2788
+
2789
+
2790
+
2791
+
2792
+
2793
+
2794
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30560/60000 [13:37:18<81:20:11, 9.95s/it]
2795
+
2796
+
2797
+
2798
+
2799
+
2800
+
2801
+
2802
+
2803
+
2804
+
2805
+
2806
+
2807
+
2808
+
2809
+
2810
+
2811
+
2812
+
2813
+
2814
+
2815
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30580/60000 [13:40:36<82:45:06, 10.13s/it]
2816
+
2817
+
2818
+
2819
+
2820
+
2821
+
2822
+
2823
+
2824
+
2825
+
2826
+
2827
+
2828
+
2829
+
2830
+
2831
+
2832
+
2833
+
2834
+
2835
+
2836
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30600/60000 [13:43:49<78:35:08, 9.62s/it]
2837
+
2838
+
2839
+
2840
+
2841
+
2842
+
2843
+
2844
+
2845
+
2846
+
2847
+
2848
+
2849
+
2850
+
2851
+
2852
+
2853
+
2854
+
2855
+
2856
+
2857
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30620/60000 [13:47:10<78:22:15, 9.60s/it]
2858
+
2859
+
2860
+
2861
+
2862
+
2863
+
2864
+
2865
+
2866
+
2867
+
2868
+
2869
+
2870
+
2871
+
2872
+
2873
+
2874
+
2875
+
2876
+
2877
+
2878
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30640/60000 [13:50:20<70:25:28, 8.64s/it]
2879
+ [2023-11-23 13:32:07,459] [INFO] [loss_scaler.py:190:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 131072, but hysteresis is 2. Reducing hysteresis to 1
2880
+
2881
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30641/60000 [13:50:26<64:34:06, 7.92s/it]
2882
+
2883
+
2884
+
2885
+
2886
+
2887
+
2888
+
2889
+
2890
+
2891
+
2892
+
2893
+
2894
+
2895
+
2896
+
2897
+
2898
+
2899
+
2900
+
2901
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30660/60000 [13:53:29<79:20:08, 9.73s/it]
2902
+
2903
+ Reading metadata...: 1650it [00:00, 10271.70it/s]78:55:47, 9.68s/it]
2904
+
2905
+
2906
+
2907
+
2908
+
2909
+
2910
+
2911
+
2912
+
2913
+
2914
+
2915
+
2916
+
2917
+
2918
+
2919
+
2920
+
2921
+
2922
+
2923
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30680/60000 [13:56:51<78:41:06, 9.66s/it]
2924
+
2925
+
2926
+
2927
+
2928
+
2929
+
2930
+
2931
+
2932
+
2933
+
2934
+
2935
+
2936
+
2937
+
2938
+
2939
+
2940
+
2941
+
2942
+
2943
+
2944
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30700/60000 [14:00:04<78:09:47, 9.60s/it]
2945
+
2946
+
2947
+
2948
+
2949
+
2950
+
2951
+
2952
+
2953
+
2954
+
2955
+
2956
+
2957
+
2958
+
2959
+
2960
+
2961
+
2962
+
2963
+
2964
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30719/60000 [14:03:08<79:19:49, 9.75s/it]
2965
+
2966
+
2967
+
2968
+
2969
+
2970
+
2971
+
2972
+
2973
+
2974
+
2975
+
2976
+
2977
+
2978
+
2979
+
2980
+
2981
+
2982
+
2983
+
2984
+
2985
+
2986
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 30740/60000 [14:06:32<79:48:04, 9.82s/it]
2987
+
2988
+
2989
+
2990
+
2991
+
2992
+
2993
+
2994
+
2995
+
2996
+
2997
+
2998
+
2999
+
3000
+
3001
+
3002
+
3003
+
3004
+
3005
+
3006
+
3007
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30760/60000 [14:09:50<80:23:14, 9.90s/it]
3008
+
3009
+
3010
+
3011
+
3012
+
3013
+
3014
+
3015
+
3016
+
3017
+
3018
+
3019
+
3020
+
3021
+
3022
+
3023
+
3024
+
3025
+
3026
+
3027
+
3028
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30780/60000 [14:13:03<78:14:38, 9.64s/it]
3029
+
3030
+
3031
+
3032
+
3033
+
3034
+
3035
+
3036
+
3037
+
3038
+
3039
+
3040
+
3041
+
3042
+
3043
+
3044
+
3045
+
3046
+
3047
+
3048
+
3049
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 30800/60000 [14:16:44<100:31:41, 12.39s/it]
3050
+
3051
+
3052
+
3053
+
3054
+
3055
+
3056
+
3057
+
3058
+
3059
+
3060
+
3061
+
3062
+
3063
+
3064
+
3065
+
3066
+
3067
+
3068
+
3069
+
3070
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30820/60000 [14:20:09<80:03:04, 9.88s/it]
3071
+
3072
+
3073
+
3074
+
3075
+
3076
+
3077
+
3078
+
3079
+
3080
+
3081
+
3082
+
3083
+
3084
+
3085
+
3086
+
3087
+
3088
+
3089
+
3090
+
3091
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30840/60000 [14:23:52<78:43:13, 9.72s/it]
3092
+
3093
+
3094
+
3095
+
3096
+
3097
+
3098
+
3099
+
3100
+
3101
+
3102
+
3103
+
3104
+
3105
+
3106
+
3107
+
3108
+
3109
+
3110
+
3111
+
3112
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30860/60000 [14:27:41<80:51:40, 9.99s/it]
3113
+
3114
+
3115
+
3116
+
3117
+
3118
+
3119
+
3120
+
3121
+
3122
+
3123
+
3124
+
3125
+
3126
+
3127
+
3128
+
3129
+
3130
+
3131
+
3132
+
3133
+ 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30880/60000 [14:30:55<78:07:44, 9.66s/it]
3134
+
3135
+
3136
+
3137
+
3138
+
3139
+
3140
+
3141
+
3142
+
3143
+
3144
+
3145
+
3146
+
3147
+
3148
+
3149
+
3150
+
3151
+
3152
+
3153
+
3154
+ 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30900/60000 [14:34:12<78:39:08, 9.73s/it]
3155
+
3156
+
3157
+
3158
+
3159
+
3160
+
3161
+
3162
+
3163
+
3164
+
3165
+
3166
+
3167
+
3168
+
3169
+
3170
+
3171
+
3172
+
3173
+
3174
+
3175
+ 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30920/60000 [14:37:26<78:29:36, 9.72s/it]
3176
+
3177
+
3178
+
3179
+
3180
+
3181
+
3182
+
3183
+
3184
+
3185
+
3186
+
3187
+
3188
+
3189
+
3190
+
3191
+
3192
+
3193
+
3194
+
3195
+
3196
+ 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30940/60000 [14:40:44<77:20:19, 9.58s/it]
3197
+
3198
+
3199
+
3200
+
3201
+
3202
+
3203
+
3204
+
3205
+
3206
+
3207
+
3208
+
3209
+
3210
+
3211
+
3212
+
3213
+
3214
+
3215
+
3216
+
3217
+ 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30960/60000 [14:44:07<75:55:03, 9.41s/it]
3218
+
3219
+
3220
+
3221
+
3222
+
3223
+
3224
+
3225
+
3226
+
3227
+
3228
+
3229
+
3230
+
3231
+
3232
+
3233
+
3234
+
3235
+
3236
+
3237
+ 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 30979/60000 [14:47:09<78:34:27, 9.75s/it]
3238
+
3239
+
3240
+
3241
+
3242
+
3243
+
3244
+
3245
+
3246
+
3247
+
3248
+
3249
+
3250
+
3251
+
3252
+
3253
+
3254
+
3255
+
3256
+
3257
+
3258
+ 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 31000/60000 [14:50:32<77:14:57, 9.59s/it][INFO|trainer.py:3173] 2023-11-23 14:32:19,981 >> ***** Running Evaluation *****
3259
+ [INFO|trainer.py:3177] 2023-11-23 14:32:19,981 >> Num examples: Unknown
3260
+ [INFO|trainer.py:3178] 2023-11-23 14:32:19,981 >> Batch size = 4
3261
+ {'loss': 0.0228, 'learning_rate': 1.4771186440677966e-06, 'epoch': 0.52}
3262
+ Reading metadata...: 1704it [00:00, 9424.36it/s]
3263
+ [INFO|trainer_utils.py:759] 2023-11-23 14:32:21,199 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: down_votes, up_votes, gender, locale, segment, client_id, accent, input_length, path, age. If down_votes, up_votes, gender, locale, segment, client_id, accent, input_length, path, age are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.
3264
+ {'eval_loss': 0.202880859375, 'eval_wer': 8.386732001507728, 'eval_runtime': 595.6213, 'eval_samples_per_second': 2.861, 'eval_steps_per_second': 0.715, 'epoch': 0.52}
3265
+ 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 31000/60000 [15:00:28<77:14:57, 9.59s/it][INFO|trainer.py:2896] 2023-11-23 14:42:39,683 >> Saving model checkpoint to ./checkpoint-31000
3266
+ [INFO|configuration_utils.py:462] 2023-11-23 14:42:39,693 >> Configuration saved in ./checkpoint-31000/config.json
3267
+ [INFO|configuration_utils.py:568] 2023-11-23 14:42:39,711 >> Configuration saved in ./checkpoint-31000/generation_config.json
3268
+ [2023-11-23 14:43:25,683] [INFO] [logging.py:96:log_dist] [Rank 0] [Torch] Checkpoint global_step31000 is about to be saved!
3269
+ [2023-11-23 14:43:25,724] [INFO] [logging.py:96:log_dist] [Rank 0] Saving model checkpoint: ./checkpoint-31000/global_step31000/mp_rank_00_model_states.pt
3270
+ [2023-11-23 14:43:25,724] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving ./checkpoint-31000/global_step31000/mp_rank_00_model_states.pt...
3271
+ [INFO|modeling_utils.py:2194] 2023-11-23 14:43:25,656 >> Model weights saved in ./checkpoint-31000/pytorch_model.bin
3272
+ [INFO|feature_extraction_utils.py:425] 2023-11-23 14:43:25,662 >> Feature extractor saved in ./checkpoint-31000/preprocessor_config.json
3273
+ [2023-11-23 14:43:39,756] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved ./checkpoint-31000/global_step31000/mp_rank_00_model_states.pt.
3274
+ [2023-11-23 14:43:39,775] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving ./checkpoint-31000/global_step31000/zero_pp_rank_0_mp_rank_00_optim_states.pt...
3275
+ [2023-11-23 14:44:06,768] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved ./checkpoint-31000/global_step31000/zero_pp_rank_0_mp_rank_00_optim_states.pt.
3276
+ [2023-11-23 14:44:06,798] [INFO] [engine.py:3417:_save_zero_checkpoint] zero checkpoint saved ./checkpoint-31000/global_step31000/zero_pp_rank_0_mp_rank_00_optim_states.pt
3277
+ [2023-11-23 14:44:06,798] [INFO] [torch_checkpoint_engine.py:33:commit] [Torch] Checkpoint global_step31000 is ready now!
3278
+ [INFO|feature_extraction_utils.py:425] 2023-11-23 14:45:03,018 >> Feature extractor saved in ./preprocessor_config.json
3279
+
3280
+
3281
+
3282
+
3283
+
3284
+
3285
+
3286
+
3287
+
3288
+
3289
+
3290
+
3291
+
3292
+
3293
+
3294
+
3295
+
3296
+
3297
+ 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 31019/60000 [15:06:34<95:41:12, 11.89s/it]
3298
+
3299
+
3300
+
3301
+
3302
+
3303
+
3304
+
3305
+
3306
+
3307
+
3308
+
3309
+
3310
+
3311
+
3312
+
3313
+
3314
+
3315
+
3316
+
3317
+
3318
+ 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 31039/60000 [15:09:53<79:19:11, 9.86s/it]
3319
+
3320
+
3321
+
3322
+
3323
+
3324
+
3325
+
3326
+
3327
+
3328
+
3329
+
3330
+
3331
+
3332
+
3333
+
3334
+
3335
+
3336
+
3337
+
3338
+
3339
+ 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 31059/60000 [15:13:09<78:01:30, 9.71s/it]
3340
+
3341
+
3342
+
3343
+
3344
+
3345
+
3346
+
3347
+
3348
+
wandb/run-20231122_234140-blvl5h7j/files/wandb-summary.json CHANGED
@@ -1 +1 @@
1
- {"train/loss": 0.0812, "train/learning_rate": 1.5248135593220338e-06, "train/epoch": 0.5, "train/global_step": 30060, "_timestamp": 1700733415.2111008, "_runtime": 44114.32262992859, "_step": 104, "eval/loss": 0.193603515625, "eval/wer": 8.358462118356577, "eval/runtime": 600.6457, "eval/samples_per_second": 2.837, "eval/steps_per_second": 0.709}
 
1
+ {"train/loss": 0.039, "train/learning_rate": 1.4740677966101696e-06, "train/epoch": 0.52, "train/global_step": 31060, "_timestamp": 1700744106.793268, "_runtime": 54805.90479707718, "_step": 155, "eval/loss": 0.202880859375, "eval/wer": 8.386732001507728, "eval/runtime": 595.6213, "eval/samples_per_second": 2.861, "eval/steps_per_second": 0.715}
wandb/run-20231122_234140-blvl5h7j/logs/debug-internal.log CHANGED
The diff for this file is too large to render. See raw diff
 
wandb/run-20231122_234140-blvl5h7j/run-blvl5h7j.wandb CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:2907ff4ad50e187d1d77714ad79bcb13875da01d13afbec0001cade84a998108
3
- size 1994140
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e1d76afc306ea64fa0af8f18e568ce1cd446c18118aa650f13d61464fe0dfd0c
3
+ size 2588702