Update README.md
README.md
CHANGED
@@ -2,7 +2,7 @@ This data is from [13B-en training](https://github.com/bigscience-workshop/bigsc
-- indices - these are shuffled indices that the training was using. They were generated the first time the training started. So the order is the same if one replays them via the dataloader w/o actually doing the training steps.
+- indices - these are Megatron-LM shuffled indices that the training was using. They were generated the first time the training started. So the order is the same if one replays them via the dataloader w/o actually doing the training steps.
 - the corresponding dataset is oscar-en that's on JZ at `$six_ALL_CCFRWORK/datasets-custom/oscar-en`