srsawant34 committed on Dec 18, 2023

Commit

2a95641

1 Parent(s): d4e7c7a

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes.

Files changed (50)

1_Pooling/config.json +7 -0
README.md +107 -0
checkpoint-1148/model.safetensors +3 -0
checkpoint-1148/optimizer.pt +3 -0
checkpoint-1148/rng_state.pth +3 -0
checkpoint-1148/scheduler.pt +3 -0
checkpoint-1148/trainer_state.json +45 -0
checkpoint-1148/training_args.bin +3 -0
checkpoint-1435/model.safetensors +3 -0
checkpoint-1435/optimizer.pt +3 -0
checkpoint-1435/rng_state.pth +3 -0
checkpoint-1435/scheduler.pt +3 -0
checkpoint-1435/trainer_state.json +51 -0
checkpoint-1435/training_args.bin +3 -0
checkpoint-1722/model.safetensors +3 -0
checkpoint-1722/optimizer.pt +3 -0
checkpoint-1722/rng_state.pth +3 -0
checkpoint-1722/scheduler.pt +3 -0
checkpoint-1722/trainer_state.json +57 -0
checkpoint-1722/training_args.bin +3 -0
checkpoint-2009/model.safetensors +3 -0
checkpoint-2009/optimizer.pt +3 -0
checkpoint-2009/rng_state.pth +3 -0
checkpoint-2009/scheduler.pt +3 -0
checkpoint-2009/trainer_state.json +63 -0
checkpoint-2009/training_args.bin +3 -0
checkpoint-2296/model.safetensors +3 -0
checkpoint-2296/optimizer.pt +3 -0
checkpoint-2296/rng_state.pth +3 -0
checkpoint-2296/scheduler.pt +3 -0
checkpoint-2296/trainer_state.json +69 -0
checkpoint-2296/training_args.bin +3 -0
checkpoint-287/model.safetensors +3 -0
checkpoint-287/optimizer.pt +3 -0
checkpoint-287/rng_state.pth +3 -0
checkpoint-287/scheduler.pt +3 -0
checkpoint-287/trainer_state.json +27 -0
checkpoint-287/training_args.bin +3 -0
checkpoint-574/model.safetensors +3 -0
checkpoint-574/optimizer.pt +3 -0
checkpoint-574/rng_state.pth +3 -0
checkpoint-574/scheduler.pt +3 -0
checkpoint-574/trainer_state.json +33 -0
checkpoint-574/training_args.bin +3 -0
checkpoint-8323/model.safetensors +3 -0
checkpoint-8323/optimizer.pt +3 -0
checkpoint-8323/rng_state.pth +3 -0
checkpoint-8323/scheduler.pt +3 -0
checkpoint-8323/trainer_state.json +195 -0
checkpoint-8323/training_args.bin +3 -0

1_Pooling/config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "word_embedding_dimension": 384,
+  "pooling_mode_cls_token": false,
+  "pooling_mode_mean_tokens": true,
+  "pooling_mode_max_tokens": false,
+  "pooling_mode_mean_sqrt_len_tokens": false
+}
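
The pooling config above switches on only mean-token pooling over 384-dimensional token embeddings. A minimal sketch of what that selection means, using plain Python lists rather than the sentence-transformers implementation; the helper names `active_pooling_modes` and `mean_pool` are hypothetical and exist only for illustration:

```python
import json

# Config copied from 1_Pooling/config.json in this commit.
CONFIG_JSON = """
{
  "word_embedding_dimension": 384,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false
}
"""


def active_pooling_modes(config):
    """Return the names of the pooling modes that are switched on."""
    return [
        key
        for key, value in config.items()
        if key.startswith("pooling_mode_") and value is True
    ]


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


config = json.loads(CONFIG_JSON)
modes = active_pooling_modes(config)

# Toy 2-dimensional "token embeddings"; the third token is padding.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
print(modes, pooled)
```

The padded third token is excluded from the average, which is the same role the attention mask plays in the README's `mean_pooling` function.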

README.md ADDED

	@@ -0,0 +1,107 @@

+---
+pipeline_tag: sentence-similarity
+license: apache-2.0
+tags:
+- sentence-transformers
+- feature-extraction
+- sentence-similarity
+- transformers
+---
+# sentence-transformers/paraphrase-MiniLM-L6-v2
+This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search.
+## Usage (Sentence-Transformers)
+Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
+```
+pip install -U sentence-transformers
+```
+Then you can use the model like this:
+```python
+from sentence_transformers import SentenceTransformer
+sentences = ["This is an example sentence", "Each sentence is converted"]
+model = SentenceTransformer('sentence-transformers/paraphrase-MiniLM-L6-v2')
+embeddings = model.encode(sentences)
+print(embeddings)
+```
+## Usage (HuggingFace Transformers)
+Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: first, pass your input through the transformer model, then apply the right pooling operation on top of the contextualized word embeddings.
+```python
+from transformers import AutoTokenizer, AutoModel
+import torch
+#Mean Pooling - Take attention mask into account for correct averaging
+def mean_pooling(model_output, attention_mask):
+    token_embeddings = model_output[0] #First element of model_output contains all token embeddings
+    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+# Sentences we want sentence embeddings for
+sentences = ['This is an example sentence', 'Each sentence is converted']
+# Load model from HuggingFace Hub
+tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/paraphrase-MiniLM-L6-v2')
+model = AutoModel.from_pretrained('sentence-transformers/paraphrase-MiniLM-L6-v2')
+# Tokenize sentences
+encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+# Compute token embeddings
+with torch.no_grad():
+    model_output = model(**encoded_input)
+# Perform pooling. In this case, mean pooling.
+sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+print("Sentence embeddings:")
+print(sentence_embeddings)
+```
+## Evaluation Results
+For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=sentence-transformers/paraphrase-MiniLM-L6-v2)
+## Full Model Architecture
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
+  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
+)
+```
+## Citing & Authors
+This model was trained by [sentence-transformers](https://www.sbert.net/).
+If you find this model helpful, feel free to cite our publication [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084):
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "http://arxiv.org/abs/1908.10084",
+}
+```
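
The README mentions semantic search as a use for the 384-dimensional embeddings. A minimal sketch of the ranking step, assuming cosine similarity as the scoring function (a common choice, not one the README specifies) and using toy 3-dimensional vectors in place of real model output; `cosine_similarity` and the document names are hypothetical:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for sentence embeddings (real ones would have 384 entries).
query = [1.0, 0.0, 1.0]
corpus = {
    "doc_a": [1.0, 0.0, 1.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [1.0, 1.0, 0.0],
}

# Rank documents from most to least similar to the query.
ranked = sorted(
    corpus,
    key=lambda name: cosine_similarity(query, corpus[name]),
    reverse=True,
)
print(ranked)
```

In practice the vectors would come from `model.encode(sentences)` as shown in the README's usage section.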

checkpoint-1148/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0b2098e96fe45a5f982c95e9326eeb8656a11a99113e82ae00dda07104972886
+size 90866120
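
The three added lines above are a Git LFS pointer, not the weights themselves: the binary is stored separately and identified by its SHA-256 digest and byte size. A minimal sketch of reading such a pointer, assuming the simple `key value` line layout seen in this diff; `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer (one 'key value' pair per line) into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer copied from checkpoint-1148/model.safetensors in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:0b2098e96fe45a5f982c95e9326eeb8656a11a99113e82ae00dda07104972886
size 90866120
"""

info = parse_lfs_pointer(POINTER)
size_mib = info["size_bytes"] / (1024 * 1024)
print(info["hash_algo"], info["size_bytes"], f"{size_mib:.1f} MiB")
```

Every `model.safetensors` in this commit has the same size but a different digest, which is what one would expect from checkpoints of the same architecture with different weights.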

checkpoint-1148/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3179416f106ae2f73588f990002ac14e2c2d259eaba15ca06fb8f9e8e63d390e
+size 180607738

checkpoint-1148/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:87082ae80ddb9d34dbe1ab1348f8126541868839519795215c9dcc059ab63fc6
+size 14244

checkpoint-1148/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5c37d10e949e0aa0b73992efa2d5c4e60e9e1eb519cda5feb64f0ab41724e9f3
+size 1064

checkpoint-1148/trainer_state.json ADDED

	@@ -0,0 +1,45 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 4.0,
+  "eval_steps": 500,
+  "global_step": 1148,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 1.0,
+      "learning_rate": 0.0004375,
+      "loss": 2.2956,
+      "step": 287
+    },
+    {
+      "epoch": 2.0,
+      "learning_rate": 0.000375,
+      "loss": 2.2508,
+      "step": 574
+    },
+    {
+      "epoch": 3.0,
+      "learning_rate": 0.0003125,
+      "loss": 2.223,
+      "step": 861
+    },
+    {
+      "epoch": 4.0,
+      "learning_rate": 0.00025,
+      "loss": 2.1849,
+      "step": 1148
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 2296,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 8,
+  "save_steps": 500,
+  "total_flos": 0.0,
+  "train_batch_size": 64,
+  "trial_name": null,
+  "trial_params": null
+}

checkpoint-1148/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:59691d62067580cb96432d7835b9a22fa0cdbf9d683d1ce4d96a99344613e85b
+size 4792

checkpoint-1435/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:dc8c4d32a407aaa87a88f0e6bbd1ac189f2a27e026dd588466a5f8b37576ff9b
+size 90866120

checkpoint-1435/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d10bd4efa881893777d05d306a2a6cabf7b54b6127e45e93eb4bb7d740c6cdb3
+size 180607738

checkpoint-1435/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:50e824cfb48c98373f53fa9ca5548f2cb8c492dbb792ffd913076a907a3be071
+size 14244

checkpoint-1435/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c3b5c008aeab85448dc93b26ef48ef1b1487424fcaa42a2b4edbcc01433b9da0
+size 1064

checkpoint-1435/trainer_state.json ADDED

	@@ -0,0 +1,51 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 5.0,
+  "eval_steps": 500,
+  "global_step": 1435,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 1.0,
+      "learning_rate": 0.0004375,
+      "loss": 2.2956,
+      "step": 287
+    },
+    {
+      "epoch": 2.0,
+      "learning_rate": 0.000375,
+      "loss": 2.2508,
+      "step": 574
+    },
+    {
+      "epoch": 3.0,
+      "learning_rate": 0.0003125,
+      "loss": 2.223,
+      "step": 861
+    },
+    {
+      "epoch": 4.0,
+      "learning_rate": 0.00025,
+      "loss": 2.1849,
+      "step": 1148
+    },
+    {
+      "epoch": 5.0,
+      "learning_rate": 0.0001875,
+      "loss": 2.129,
+      "step": 1435
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 2296,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 8,
+  "save_steps": 500,
+  "total_flos": 0.0,
+  "train_batch_size": 64,
+  "trial_name": null,
+  "trial_params": null
+}

checkpoint-1435/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:59691d62067580cb96432d7835b9a22fa0cdbf9d683d1ce4d96a99344613e85b
+size 4792

checkpoint-1722/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c8f67022add8bdf2b50f0f9728abf4878d9769cf37b6c527772b38f276f87ba
+size 90866120

checkpoint-1722/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ab2d2ef97ecad6c8c0dbd0832adfc3310b8ac297d3d71519d4812ce989d7d08a
+size 180607738

checkpoint-1722/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6410d8133abaf4a88621d2bcb3b13abb0ad5ba2fb8420e82b82efbc22b99f0a5
+size 14244

checkpoint-1722/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ade3fcc8dc579752118cf26ed7f71bb709b1db10d0dc00e45e62e34e19de1862
+size 1064

checkpoint-1722/trainer_state.json ADDED

	@@ -0,0 +1,57 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 6.0,
+  "eval_steps": 500,
+  "global_step": 1722,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 1.0,
+      "learning_rate": 0.0004375,
+      "loss": 2.2956,
+      "step": 287
+    },
+    {
+      "epoch": 2.0,
+      "learning_rate": 0.000375,
+      "loss": 2.2508,
+      "step": 574
+    },
+    {
+      "epoch": 3.0,
+      "learning_rate": 0.0003125,
+      "loss": 2.223,
+      "step": 861
+    },
+    {
+      "epoch": 4.0,
+      "learning_rate": 0.00025,
+      "loss": 2.1849,
+      "step": 1148
+    },
+    {
+      "epoch": 5.0,
+      "learning_rate": 0.0001875,
+      "loss": 2.129,
+      "step": 1435
+    },
+    {
+      "epoch": 6.0,
+      "learning_rate": 0.000125,
+      "loss": 2.0923,
+      "step": 1722
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 2296,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 8,
+  "save_steps": 500,
+  "total_flos": 0.0,
+  "train_batch_size": 64,
+  "trial_name": null,
+  "trial_params": null
+}

checkpoint-1722/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:59691d62067580cb96432d7835b9a22fa0cdbf9d683d1ce4d96a99344613e85b
+size 4792

checkpoint-2009/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:404ec8daad9aaa3bbf36ac0b5ae66ebc740e09b008b59e7b546f4d65846cf0c8
+size 90866120

checkpoint-2009/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7155b933ef2854d6e51dc2007a87d953256098b5a55202a2f7b38594c7c1489c
+size 180607738

checkpoint-2009/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8e5a38071360faecf02821c8f4fbb614f6435fe31e87d746da0bc03e552b342c
+size 14244

checkpoint-2009/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4cee79f61c3bb2fb8a9032e1977574f64f859c6dd73869c3b327fbffa6589172
+size 1064

checkpoint-2009/trainer_state.json ADDED

	@@ -0,0 +1,63 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 7.0,
+  "eval_steps": 500,
+  "global_step": 2009,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 1.0,
+      "learning_rate": 0.0004375,
+      "loss": 2.2956,
+      "step": 287
+    },
+    {
+      "epoch": 2.0,
+      "learning_rate": 0.000375,
+      "loss": 2.2508,
+      "step": 574
+    },
+    {
+      "epoch": 3.0,
+      "learning_rate": 0.0003125,
+      "loss": 2.223,
+      "step": 861
+    },
+    {
+      "epoch": 4.0,
+      "learning_rate": 0.00025,
+      "loss": 2.1849,
+      "step": 1148
+    },
+    {
+      "epoch": 5.0,
+      "learning_rate": 0.0001875,
+      "loss": 2.129,
+      "step": 1435
+    },
+    {
+      "epoch": 6.0,
+      "learning_rate": 0.000125,
+      "loss": 2.0923,
+      "step": 1722
+    },
+    {
+      "epoch": 7.0,
+      "learning_rate": 6.25e-05,
+      "loss": 2.0515,
+      "step": 2009
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 2296,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 8,
+  "save_steps": 500,
+  "total_flos": 0.0,
+  "train_batch_size": 64,
+  "trial_name": null,
+  "trial_params": null
+}

checkpoint-2009/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:59691d62067580cb96432d7835b9a22fa0cdbf9d683d1ce4d96a99344613e85b
+size 4792

checkpoint-2296/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e26c278c82dfef249a456ec36e221a988b6d68d362fee2cb8d267663be9a8839
+size 90866120

checkpoint-2296/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:321acd24548f9e1e302833e2279fa7442bf21c987d8e771bd5c094247b334456
+size 180607738

checkpoint-2296/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6089dac575b53bd88bb39a67970f95bf63a880871d9872b7a41372a647ff1653
+size 14244

checkpoint-2296/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d1d06d8c680538100d481f36453ab9a97fc9432e250541c38cdeebf478aa981f
+size 1064

checkpoint-2296/trainer_state.json ADDED

	@@ -0,0 +1,69 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 8.0,
+  "eval_steps": 500,
+  "global_step": 2296,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 1.0,
+      "learning_rate": 0.0004375,
+      "loss": 2.2956,
+      "step": 287
+    },
+    {
+      "epoch": 2.0,
+      "learning_rate": 0.000375,
+      "loss": 2.2508,
+      "step": 574
+    },
+    {
+      "epoch": 3.0,
+      "learning_rate": 0.0003125,
+      "loss": 2.223,
+      "step": 861
+    },
+    {
+      "epoch": 4.0,
+      "learning_rate": 0.00025,
+      "loss": 2.1849,
+      "step": 1148
+    },
+    {
+      "epoch": 5.0,
+      "learning_rate": 0.0001875,
+      "loss": 2.129,
+      "step": 1435
+    },
+    {
+      "epoch": 6.0,
+      "learning_rate": 0.000125,
+      "loss": 2.0923,
+      "step": 1722
+    },
+    {
+      "epoch": 7.0,
+      "learning_rate": 6.25e-05,
+      "loss": 2.0515,
+      "step": 2009
+    },
+    {
+      "epoch": 8.0,
+      "learning_rate": 0.0,
+      "loss": 2.027,
+      "step": 2296
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 2296,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 8,
+  "save_steps": 500,
+  "total_flos": 0.0,
+  "train_batch_size": 64,
+  "trial_name": null,
+  "trial_params": null
+}
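
The `learning_rate` values logged above fall by the same amount each epoch and reach 0.0 at `max_steps`, which is consistent with a linear decay to zero without warmup. A minimal sketch that checks this reading against the logged values; the initial rate of 5e-4 is an assumption inferred from the numbers, since the commit only ships the training arguments as a binary file, and `linear_decay_lr` is a hypothetical helper:

```python
import math

MAX_STEPS = 2296  # "max_steps" from checkpoint-2296/trainer_state.json
BASE_LR = 5e-4    # assumption: inferred from the logged values, not stated in the commit

# (step, learning_rate) pairs copied from log_history of checkpoint-2296.
LOGGED = [
    (287, 0.0004375),
    (574, 0.000375),
    (861, 0.0003125),
    (1148, 0.00025),
    (1435, 0.0001875),
    (1722, 0.000125),
    (2009, 6.25e-05),
    (2296, 0.0),
]


def linear_decay_lr(step, max_steps=MAX_STEPS, base_lr=BASE_LR):
    """Learning rate under linear decay to zero with no warmup."""
    return base_lr * max(0.0, 1.0 - step / max_steps)


# Compare the assumed schedule with every logged value.
matches = all(
    math.isclose(linear_decay_lr(step), lr, rel_tol=1e-9, abs_tol=1e-12)
    for step, lr in LOGGED
)
print("logged values follow linear decay:", matches)
```

If the check holds, the loss figures at later epochs were obtained at progressively smaller step sizes, which is worth keeping in mind when comparing checkpoints.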

checkpoint-2296/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:59691d62067580cb96432d7835b9a22fa0cdbf9d683d1ce4d96a99344613e85b
+size 4792

checkpoint-287/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:39da78be7e3a484ab8795fb515ec18f0be4754c5cab212996559e1a53ee32de6
+size 90866120

checkpoint-287/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c0a5b06cb0024bd0156188d29444929a1dc946adcc7c294a6fd797e1254b8e5
+size 180607738

checkpoint-287/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d95ef98f5cd4b6c4540739b122435f8121ce1d964466889af8f8484ca6504e2c
+size 14244

checkpoint-287/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ddfeeb2c40d442d9cf6870b8d5c753e579d58c73dd822f86ccebebc7298add3a
+size 1064

checkpoint-287/trainer_state.json ADDED

	@@ -0,0 +1,27 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 1.0,
+  "eval_steps": 500,
+  "global_step": 287,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 1.0,
+      "learning_rate": 0.0004375,
+      "loss": 2.2956,
+      "step": 287
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 2296,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 8,
+  "save_steps": 500,
+  "total_flos": 0.0,
+  "train_batch_size": 64,
+  "trial_name": null,
+  "trial_params": null
+}

checkpoint-287/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:59691d62067580cb96432d7835b9a22fa0cdbf9d683d1ce4d96a99344613e85b
+size 4792

checkpoint-574/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:349a2c2e0c215111818507eb78aa0b183262faa7649367608bc67e77ea5d2218
+size 90866120

checkpoint-574/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e5c92b4b4b84841ca6f94381ed19e37081521ad086f7346a6344212e73439111
+size 180607738

checkpoint-574/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b8a4c4c8306949aa90fda7de408dd3be7bad8cfdb6a96376be8e42b698c0dcec
+size 14244

checkpoint-574/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2053df1c10f5c000fa932bfa8f8a1e64f92dd69d5c946f45ddb7e721b17ac683
+size 1064

checkpoint-574/trainer_state.json ADDED

	@@ -0,0 +1,33 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 2.0,
+  "eval_steps": 500,
+  "global_step": 574,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 1.0,
+      "learning_rate": 0.0004375,
+      "loss": 2.2956,
+      "step": 287
+    },
+    {
+      "epoch": 2.0,
+      "learning_rate": 0.000375,
+      "loss": 2.2508,
+      "step": 574
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 2296,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 8,
+  "save_steps": 500,
+  "total_flos": 0.0,
+  "train_batch_size": 64,
+  "trial_name": null,
+  "trial_params": null
+}

checkpoint-574/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:59691d62067580cb96432d7835b9a22fa0cdbf9d683d1ce4d96a99344613e85b
+size 4792

checkpoint-8323/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8eb2a5db42fe1825e43b69b178c42992e7d87576776e33758f75592acf8c1f89
+size 90866120

checkpoint-8323/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2b94a49c3e5c9b3b80c1f296a41d62252434fdc86628b024267ec35270194497
+size 180607738

checkpoint-8323/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:113c7031f546d1e57f4645de606e8624d51751acbde70de8fdcf580b016726fa
+size 14244

checkpoint-8323/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:003133324f971f67e818819da368b5822c56b1e11457a78920cf74596a8a62a2
+size 1064

checkpoint-8323/trainer_state.json ADDED

	@@ -0,0 +1,195 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 29.0,
+  "eval_steps": 500,
+  "global_step": 8323,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 1.0,
+      "learning_rate": 0.000484375,
+      "loss": 3.0823,
+      "step": 287
+    },
+    {
+      "epoch": 2.0,
+      "learning_rate": 0.00046875,
+      "loss": 2.7242,
+      "step": 574
+    },
+    {
+      "epoch": 3.0,
+      "learning_rate": 0.000453125,
+      "loss": 2.5348,
+      "step": 861
+    },
+    {
+      "epoch": 4.0,
+      "learning_rate": 0.0004375,
+      "loss": 2.4455,
+      "step": 1148
+    },
+    {
+      "epoch": 5.0,
+      "learning_rate": 0.000421875,
+      "loss": 2.3794,
+      "step": 1435
+    },
+    {
+      "epoch": 6.0,
+      "learning_rate": 0.00040625000000000004,
+      "loss": 2.3375,
+      "step": 1722
+    },
+    {
+      "epoch": 7.0,
+      "learning_rate": 0.000390625,
+      "loss": 2.3262,
+      "step": 2009
+    },
+    {
+      "epoch": 8.0,
+      "learning_rate": 0.000375,
+      "loss": 2.3114,
+      "step": 2296
+    },
+    {
+      "epoch": 9.0,
+      "learning_rate": 0.000359375,
+      "loss": 2.2921,
+      "step": 2583
+    },
+    {
+      "epoch": 10.0,
+      "learning_rate": 0.00034375,
+      "loss": 2.2918,
+      "step": 2870
+    },
+    {
+      "epoch": 11.0,
+      "learning_rate": 0.000328125,
+      "loss": 2.2578,
+      "step": 3157
+    },
+    {
+      "epoch": 12.0,
+      "learning_rate": 0.0003125,
+      "loss": 2.2693,
+      "step": 3444
+    },
+    {
+      "epoch": 13.0,
+      "learning_rate": 0.000296875,
+      "loss": 2.2594,
+      "step": 3731
+    },
+    {
+      "epoch": 14.0,
+      "learning_rate": 0.00028125000000000003,
+      "loss": 2.2555,
+      "step": 4018
+    },
+    {
+      "epoch": 15.0,
+      "learning_rate": 0.000265625,
+      "loss": 2.2481,
+      "step": 4305
+    },
+    {
+      "epoch": 16.0,
+      "learning_rate": 0.00025,
+      "loss": 2.2468,
+      "step": 4592
+    },
+    {
+      "epoch": 17.0,
+      "learning_rate": 0.000234375,
+      "loss": 2.248,
+      "step": 4879
+    },
+    {
+      "epoch": 18.0,
+      "learning_rate": 0.00021875,
+      "loss": 2.2435,
+      "step": 5166
+    },
+    {
+      "epoch": 19.0,
+      "learning_rate": 0.00020312500000000002,
+      "loss": 2.2319,
+      "step": 5453
+    },
+    {
+      "epoch": 20.0,
+      "learning_rate": 0.0001875,
+      "loss": 2.2303,
+      "step": 5740
+    },
+    {
+      "epoch": 21.0,
+      "learning_rate": 0.000171875,
+      "loss": 2.2215,
+      "step": 6027
+    },
+    {
+      "epoch": 22.0,
+      "learning_rate": 0.00015625,
+      "loss": 2.2256,
+      "step": 6314
+    },
+    {
+      "epoch": 23.0,
+      "learning_rate": 0.00014062500000000002,
+      "loss": 2.2257,
+      "step": 6601
+    },
+    {
+      "epoch": 24.0,
+      "learning_rate": 0.000125,
+      "loss": 2.2275,
+      "step": 6888
+    },
+    {
+      "epoch": 25.0,
+      "learning_rate": 0.000109375,
+      "loss": 2.2225,
+      "step": 7175
+    },
+    {
+      "epoch": 26.0,
+      "learning_rate": 9.375e-05,
+      "loss": 2.2166,
+      "step": 7462
+    },
+    {
+      "epoch": 27.0,
+      "learning_rate": 7.8125e-05,
+      "loss": 2.2174,
+      "step": 7749
+    },
+    {
+      "epoch": 28.0,
+      "learning_rate": 6.25e-05,
+      "loss": 2.2188,
+      "step": 8036
+    },
+    {
+      "epoch": 29.0,
+      "learning_rate": 4.6875e-05,
+      "loss": 2.2143,
+      "step": 8323
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 9184,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 32,
+  "save_steps": 500,
+  "total_flos": 0.0,
+  "train_batch_size": 64,
+  "trial_name": null,
+  "trial_params": null
+}
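
This trainer state differs from the other checkpoints in the commit: it reports `num_train_epochs` of 32 and `max_steps` of 9184 rather than 8 and 2296, so it appears to come from a separate, longer run. A minimal sketch of the step arithmetic that ties the two together, plus a rough bound on the training-set size; the bound assumes one optimizer step per batch of 64 on a single device with a possibly partial last batch, none of which the commit confirms, and both helper functions are hypothetical:

```python
TRAIN_BATCH_SIZE = 64  # "train_batch_size" from trainer_state.json

# Values copied from the trainer_state.json files in this commit.
RUNS = {
    "checkpoint-2296": {
        "global_step": 2296, "epoch": 8.0,
        "max_steps": 2296, "num_train_epochs": 8,
    },
    "checkpoint-8323": {
        "global_step": 8323, "epoch": 29.0,
        "max_steps": 9184, "num_train_epochs": 32,
    },
}


def steps_per_epoch(state):
    """Optimizer steps per epoch implied by max_steps and num_train_epochs."""
    return state["max_steps"] // state["num_train_epochs"]


def dataset_size_bounds(steps, batch_size):
    """Smallest and largest dataset sizes that yield `steps` batches per epoch.

    Assumes the last batch may be partial and no gradient accumulation.
    """
    return (steps - 1) * batch_size + 1, steps * batch_size


per_epoch = {name: steps_per_epoch(state) for name, state in RUNS.items()}
bounds = dataset_size_bounds(per_epoch["checkpoint-8323"], TRAIN_BATCH_SIZE)

# Sanity check: global_step should equal completed epochs times steps per epoch.
consistent = all(
    state["global_step"] == int(state["epoch"]) * per_epoch[name]
    for name, state in RUNS.items()
)
print(per_epoch, bounds, consistent)
```

Both runs work out to the same number of steps per epoch, which suggests they used the same training data and batch size even though their epoch budgets differ.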

checkpoint-8323/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1e68121a9357c4f016eb6bc0f031c8d8d3f664e26a8b5ed965be82c62d99c0bf
+size 4792