AlekseyKorshuk committed
Commit fc618da
1 Parent(s): 330a9be

huggingartists
README.md CHANGED
@@ -14,11 +14,11 @@ widget:
 <div class="inline-flex flex-col" style="line-height: 1.5;">
 <div class="flex">
 <div
-	style="display:DISPLAY_1; margin-left: auto; margin-right: auto; width: 92px; height:92px; border-radius: 50%; background-size: cover; background-image: url(&#39;https://images.genius.com/df75ede64ffcf049727bfbb01d323081.400x400x1.jpg&#39;)">
+	style="display:DISPLAY_1; margin-left: auto; margin-right: auto; width: 92px; height:92px; border-radius: 50%; background-size: cover; background-image: url(&#39;https://images.genius.com/1edcea93261e2e266c532ce204ba92da.1000x1000x1.jpg&#39;)">
 </div>
 </div>
 <div style="text-align: center; margin-top: 3px; font-size: 16px; font-weight: 800">🤖 HuggingArtists Model 🤖</div>
-<div style="text-align: center; font-size: 16px; font-weight: 800">The Beatles</div>
+<div style="text-align: center; font-size: 16px; font-weight: 800">MORGENSHTERN</div>
 <a href="https://genius.com/artists/morgenshtern">
 <div style="text-align: center; font-size: 14px;">@morgenshtern</div>
 </a>
@@ -34,7 +34,7 @@ To understand how the model was developed, check the [W&B report](https://wandb.
 
 ## Training data
 
-The model was trained on lyrics from The Beatles.
+The model was trained on lyrics from MORGENSHTERN.
 
 Dataset is available [here](https://huggingface.co/datasets/huggingartists/morgenshtern).
 And can be used with:
@@ -45,15 +45,15 @@ from datasets import load_dataset
 dataset = load_dataset("huggingartists/morgenshtern")
 ```
 
-[Explore the data](https://wandb.ai/huggingartists/huggingartists/runs/36ru50a4/artifacts), which is tracked with [W&B artifacts](https://docs.wandb.com/artifacts) at every step of the pipeline.
+[Explore the data](https://wandb.ai/huggingartists/huggingartists/runs/3of8bax2/artifacts), which is tracked with [W&B artifacts](https://docs.wandb.com/artifacts) at every step of the pipeline.
 
 ## Training procedure
 
-The model is based on a pre-trained [GPT-2](https://huggingface.co/gpt2) which is fine-tuned on The Beatles's lyrics.
+The model is based on a pre-trained [GPT-2](https://huggingface.co/gpt2) which is fine-tuned on MORGENSHTERN's lyrics.
 
-Hyperparameters and metrics are recorded in the [W&B training run](https://wandb.ai/huggingartists/huggingartists/runs/1k6lslqs) for full transparency and reproducibility.
+Hyperparameters and metrics are recorded in the [W&B training run](https://wandb.ai/huggingartists/huggingartists/runs/29va0sby) for full transparency and reproducibility.
 
-At the end of training, [the final model](https://wandb.ai/huggingartists/huggingartists/runs/1k6lslqs/artifacts) is logged and versioned.
+At the end of training, [the final model](https://wandb.ai/huggingartists/huggingartists/runs/29va0sby/artifacts) is logged and versioned.
 
 ## How to use
 
config.json CHANGED
@@ -35,7 +35,7 @@
   }
  },
  "torch_dtype": "float32",
- "transformers_version": "4.9.2",
+ "transformers_version": "4.10.0",
  "use_cache": true,
  "vocab_size": 50257
 }
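The only change to `config.json` is the `transformers_version` bump from 4.9.2 to 4.10.0. If a consumer of this config wants to gate behavior on that field, the comparison must be numeric, not lexicographic. A minimal stdlib-only sketch (the `version_tuple` helper is hypothetical, not part of any library):

```python
import json

def version_tuple(v: str) -> tuple:
    """Parse a dotted version string like '4.10.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in v.split("."))

# Fields mirrored from the config.json shown in this diff.
config = json.loads('{"transformers_version": "4.10.0", "use_cache": true, "vocab_size": 50257}')

# Plain string comparison gets this wrong: "4.10.0" sorts before "4.9.2".
assert "4.10.0" < "4.9.2"
assert version_tuple(config["transformers_version"]) > version_tuple("4.9.2")
```

In real code the `packaging.version.Version` class does the same job more robustly (pre-releases, local versions); the tuple trick only covers simple dotted versions like the ones here.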
evaluation.txt ADDED
@@ -0,0 +1 @@
+{"eval_loss": 1.546966552734375, "eval_runtime": 6.4283, "eval_samples_per_second": 22.556, "eval_steps_per_second": 2.956, "epoch": 3.0}
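The added `evaluation.txt` is a single JSON object. For a language model, the cross-entropy `eval_loss` is easier to interpret once converted to perplexity (exp of the loss); a small sketch reading the exact line from this commit:

```python
import json
import math

# The exact line added to evaluation.txt in this commit.
eval_line = '{"eval_loss": 1.546966552734375, "eval_runtime": 6.4283, "eval_samples_per_second": 22.556, "eval_steps_per_second": 2.956, "epoch": 3.0}'

metrics = json.loads(eval_line)
perplexity = math.exp(metrics["eval_loss"])  # roughly 4.70 for this loss

print(f"epoch {metrics['epoch']}: eval_loss={metrics['eval_loss']:.4f}, perplexity={perplexity:.2f}")
```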
flax_model.msgpack CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3d11136176323aafb2ef75f5525ef66852770716f4ced58db73503e0a7484137
+oid sha256:a828940fd9988b72532dc4a17c18b562f948e61d3a0c8b6be9e117dee381362e
 size 497764120
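The weight files in this repo are stored via Git LFS, so the diffs above and below only touch the three-line pointer file (version, oid, size), not the binary itself. A minimal sketch of parsing such a pointer, using the new values for `flax_model.msgpack` from this commit (the `parse_lfs_pointer` helper is hypothetical):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file (one space-separated key/value pair per line)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The new pointer contents for flax_model.msgpack in this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a828940fd9988b72532dc4a17c18b562f948e61d3a0c8b6be9e117dee381362e
size 497764120"""

fields = parse_lfs_pointer(pointer)
assert fields["oid"].startswith("sha256:")
assert int(fields["size"]) == 497764120  # size unchanged; only the content hash differs
```

Note that for `flax_model.msgpack` and `pytorch_model.bin` the `size` is identical before and after, which is expected: retraining changes the weight values (hence the oid) but not the tensor shapes.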
optimizer.pt CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:df0396891551f1f573ada519077a8bf740ef79c72b09df5bc47b336bcfae1a01
-size 995603825
+oid sha256:aa430dc03f846a18abdb51d16edff75c0f3552a1f8035dae856393612fde1f9a
+size 995604017
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:14245ab76b0bcb59d2619dfecebb00c58ba368ba92e4979db4a5e50454a3f65d
+oid sha256:c2264a9b234d1b11adb7339ea59e58dad473cf0dc99c0f1d311469e6ce8e50fe
 size 510403817
rng_state.pth CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:aec18bd090ee79f7be43632d1d02335edd519ec6f49a3a61a5f244bf515bf8da
+oid sha256:5dbad4e46f0b23a7fbe31e6a10224311e7ec288f4ef415ae360dec29f4e7661a
 size 14567
scheduler.pt CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cae94fe29647f1ab9ebfc3069e27ada487df598ed599d7fbb4182e85d06b41b1
+oid sha256:340c51ed8fb1370066103a87c329a9f9d39d9f589b8a9a525ff69c489da5b8e5
 size 623
trainer_state.json CHANGED
@@ -1,8 +1,8 @@
 {
-  "best_metric": null,
-  "best_model_checkpoint": null,
-  "epoch": 2.0,
-  "global_step": 232,
+  "best_metric": 1.546966552734375,
+  "best_model_checkpoint": "output/morgenshtern/checkpoint-294",
+  "epoch": 3.0,
+  "global_step": 294,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
@@ -322,11 +322,91 @@
       "learning_rate": 0.0001370993921901871,
       "loss": 1.7228,
       "step": 230
+    },
+    {
+      "epoch": 2.4,
+      "learning_rate": 9.021642375642038e-05,
+      "loss": 1.6079,
+      "step": 235
+    },
+    {
+      "epoch": 2.45,
+      "learning_rate": 7.954855279928984e-05,
+      "loss": 1.6691,
+      "step": 240
+    },
+    {
+      "epoch": 2.5,
+      "learning_rate": 6.860000000000001e-05,
+      "loss": 1.7047,
+      "step": 245
+    },
+    {
+      "epoch": 2.55,
+      "learning_rate": 5.765144720071019e-05,
+      "loss": 1.6921,
+      "step": 250
+    },
+    {
+      "epoch": 2.6,
+      "learning_rate": 4.698357624357961e-05,
+      "loss": 1.5894,
+      "step": 255
+    },
+    {
+      "epoch": 2.65,
+      "learning_rate": 3.686987328947878e-05,
+      "loss": 1.6388,
+      "step": 260
+    },
+    {
+      "epoch": 2.7,
+      "learning_rate": 2.7569617608302645e-05,
+      "loss": 1.6748,
+      "step": 265
+    },
+    {
+      "epoch": 2.76,
+      "learning_rate": 1.932123458329584e-05,
+      "loss": 1.6765,
+      "step": 270
+    },
+    {
+      "epoch": 2.81,
+      "learning_rate": 1.233618333464885e-05,
+      "loss": 1.6658,
+      "step": 275
+    },
+    {
+      "epoch": 2.86,
+      "learning_rate": 6.793535661894062e-06,
+      "loss": 1.5677,
+      "step": 280
+    },
+    {
+      "epoch": 2.91,
+      "learning_rate": 2.8353852816850843e-06,
+      "loss": 1.6118,
+      "step": 285
+    },
+    {
+      "epoch": 2.96,
+      "learning_rate": 5.632050517253132e-07,
+      "loss": 1.552,
+      "step": 290
+    },
+    {
+      "epoch": 3.0,
+      "eval_loss": 1.546966552734375,
+      "eval_runtime": 6.307,
+      "eval_samples_per_second": 22.99,
+      "eval_steps_per_second": 3.013,
+      "step": 294
     }
   ],
-  "max_steps": 232,
-  "num_train_epochs": 2,
-  "total_flos": 241695129600000.0,
+  "max_steps": 294,
+  "num_train_epochs": 3,
+  "total_flos": 306103615488000.0,
   "trial_name": null,
   "trial_params": null
 }
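With this commit, `trainer_state.json` records a best checkpoint (`best_metric` is no longer null). A short sketch of pulling the best metric and the lowest logged training loss out of such a state file, using a trimmed subset of the log entries added in this diff as sample data (the variable names are illustrative, not from the Trainer API):

```python
import json

# A trimmed trainer_state.json built from values shown in this diff.
state = json.loads("""{
  "best_metric": 1.546966552734375,
  "best_model_checkpoint": "output/morgenshtern/checkpoint-294",
  "epoch": 3.0,
  "global_step": 294,
  "log_history": [
    {"epoch": 2.86, "learning_rate": 6.793535661894062e-06, "loss": 1.5677, "step": 280},
    {"epoch": 2.91, "learning_rate": 2.8353852816850843e-06, "loss": 1.6118, "step": 285},
    {"epoch": 2.96, "learning_rate": 5.632050517253132e-07, "loss": 1.552, "step": 290},
    {"epoch": 3.0, "eval_loss": 1.546966552734375, "eval_runtime": 6.307, "step": 294}
  ]
}""")

# Training-loss entries carry "loss"; the final evaluation entry carries "eval_loss".
train_logs = [entry for entry in state["log_history"] if "loss" in entry]
best_step = min(train_logs, key=lambda entry: entry["loss"])

print(best_step["step"], best_step["loss"])   # lowest sampled training loss
print(state["best_model_checkpoint"])         # checkpoint selected by eval_loss
```

Note that `best_metric` here equals the `eval_loss` at step 294, matching the value written to `evaluation.txt`, while the lowest point-in-time training loss (step 290 in this subset) is a different, noisier signal.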
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e3f7789495a48c9ed1372c3a20ff68e3fd471ceffc8c79810dc223ac2f95c6ed
+oid sha256:eda2c3fc1a169357c5ab29108c2671d35b5d8c42cbde93e4b348dab2cf8667ff
 size 2671