Update README.md
README.md
CHANGED
@@ -54,7 +54,8 @@ More information needed
 - High-quality reasoning dataset from private documents, QAs generated by Claude AI (1.3k samples)
 - EverythingLM-v2 (0.9k samples)
 - KoCoT (2k samples)
-- Private MRC dataset - answer generated by GPT-4 (
+- Private MRC dataset - answer generated by GPT-4 (32k samples)
+Original data have ~12k question-answer pairs with context, and augmentation is applied to make 20k samples with triplet contexts case (1 correct context out of 3)
 
 ## Training procedure
 
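
The added README line describes a triplet-context augmentation (1 correct context out of 3) but does not say how it was implemented. The following is only a minimal sketch of one way such samples could be built, assuming distractor contexts are drawn at random from other examples. The function name `make_triplet_samples` and the field names `question`, `context`, `answer`, `contexts`, and `correct_index` are hypothetical and do not come from the commit.

```python
import random


def make_triplet_samples(examples, num_samples, seed=0):
    """Pair each sampled question with 3 contexts, exactly 1 of which is its own.

    `examples` is a list of dicts with hypothetical keys
    "question", "context", and "answer". Needs at least 3 examples.
    """
    if len(examples) < 3:
        raise ValueError("need at least 3 examples to build a triplet")

    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        # Draw three distinct examples: the first is the target,
        # the other two only contribute distractor contexts (assumption).
        target_id, *distractor_ids = rng.sample(range(len(examples)), 3)
        target = examples[target_id]
        candidates = [target["context"]] + [
            examples[i]["context"] for i in distractor_ids
        ]

        # Shuffle positions rather than texts, so the correct slot is
        # tracked even if two contexts happen to have identical text.
        order = [0, 1, 2]
        rng.shuffle(order)
        samples.append({
            "question": target["question"],
            "contexts": [candidates[i] for i in order],
            "answer": target["answer"],
            "correct_index": order.index(0),
        })
    return samples
```

With roughly 12k original pairs, calling `make_triplet_samples(pairs, 20_000)` would yield the 20k augmented samples mentioned in the README, giving 32k in total when combined with the originals.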