euclaise committed (verified) · Commit c404867 · 1 Parent(s): b57c984

Update README.md

Files changed (1):
  1. README.md +6 -1
README.md CHANGED
@@ -87,4 +87,9 @@ Keeping this in mind:
 - The exact answer is always important and is always a few tokens. Hence, we do not mask the labels or input tokens for the answer value.
 - Rarely, we ignore the rationale labels entirely, such that the model is only pushed to learn what leads to the best answer.
 
-## Results
+## Results
+
+I trained StableLM-3B-4e1t repeatedly on [TinyCoT](https://huggingface.co/datasets/euclaise/TinyCoT), along with 1000 examples from [reddit-instruct-curated](https://huggingface.co/datasets/euclaise/reddit-instruct-curated) and 1000 examples from [oasst2-curated](https://huggingface.co/datasets/sablo/oasst2_curated).
+
+I trained once with ReMask (ReMask-CoT for CoT examples), once with Masked Thought (with partial label-masking), and once with SFT.
+
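For illustration, here is a minimal, hypothetical sketch of the label-masking rules the context lines above describe: answer labels are never masked, rationale labels are randomly masked, and, rarely, the rationale labels are ignored entirely. The function name and the rates are assumptions, and this covers only the label side (the commit text implies input tokens are also masked elsewhere, which this sketch omits); it is not code from the commit.

```python
import random
from typing import List

IGNORE_INDEX = -100  # standard ignore index for PyTorch cross-entropy loss

def build_labels(prompt_ids: List[int],
                 rationale_ids: List[int],
                 answer_ids: List[int],
                 mask_rate: float = 0.4,          # assumed, not from the commit
                 p_drop_rationale: float = 0.05,  # assumed, not from the commit
                 ) -> List[int]:
    """Per-token labels: prompt is never supervised, answer tokens are
    always supervised, rationale tokens are randomly masked or, rarely,
    dropped from the loss entirely."""
    # Prompt tokens are never supervised.
    labels = [IGNORE_INDEX] * len(prompt_ids)

    if random.random() < p_drop_rationale:
        # Rarely ignore the rationale labels entirely, so the model is only
        # pushed to learn what leads to the best answer.
        labels += [IGNORE_INDEX] * len(rationale_ids)
    else:
        # Otherwise, randomly mask a fraction of the rationale labels.
        labels += [t if random.random() > mask_rate else IGNORE_INDEX
                   for t in rationale_ids]

    # The exact answer is always important and only a few tokens,
    # so its labels are never masked.
    labels += list(answer_ids)
    return labels
```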
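Likewise, a rough sketch of assembling the training mixture described in the added paragraph, using the Hugging Face `datasets` library; the split names, the seed, and the assumption that the three datasets share (or have been mapped to) a common schema are mine, not the author's.

```python
from datasets import load_dataset, concatenate_datasets

# Full TinyCoT, plus 1000 examples each from the two curated instruct sets.
tinycot = load_dataset("euclaise/TinyCoT", split="train")
reddit = (load_dataset("euclaise/reddit-instruct-curated", split="train")
          .shuffle(seed=42).select(range(1000)))
oasst2 = (load_dataset("sablo/oasst2_curated", split="train")
          .shuffle(seed=42).select(range(1000)))

# concatenate_datasets requires matching columns, so in practice each dataset
# would first be mapped to a shared (prompt, rationale, answer) format.
mixture = concatenate_datasets([tinycot, reddit, oasst2]).shuffle(seed=42)
```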