Text Generation
Transformers
Safetensors
mistral
text-generation-inference
Inference Endpoints
DarwinAnim8or committed
Commit de4059b
1 Parent(s): 754751b

Update README.md

Files changed (1)
README.md +3 -3
README.md CHANGED
@@ -5,7 +5,7 @@ datasets:
 ---
 
 # Bamboo 400M
-This is a WIP model trained only on public domain (CC0) datasets, primarily in the English language.
+This is a WIP foundational (aka base) model trained only on public domain (CC0) datasets, primarily in the English language.
 Further training is planned & ongoing, but currently no multi-language datasets are in use or planned; though this may change in the future and the current datasets *can* contain languages other than English.
 
 ## License
@@ -14,7 +14,7 @@ Though the training data of this model is CC0, the model itself is not. The mode
 ## Planned updates
 As mentioned, a few updates are planned:
 * Further training on more CC0 data, this model's weights will be updated as we pretrain on more of the listed datasets.
-* Experiment with exteding the context length using YaRN to 32k tokens.
+* Experiment with extending the context length using YaRN to 32k tokens.
 * Fine-tuning the resulting model for instruct, code and storywriting. These will then be combined using MergeKit to create a MoE model.
 * Release a GGUF version and an extended context version of the base model
 
@@ -27,7 +27,7 @@ This table tracks the performance of our model on various tasks over time.
 | 2024-07-27 | acc | 27.40% ± 0.92% | 25.52% ± 0.44% | 52.71% ± 3.01% | 39.52% ± 1.11% | 36.29% |
 
 ## Legend
-- Date: The date of each evaluation run
+- Date: The date of the model that the evaluation was run on. Pretraining is ongoing and tests are re-run with that date's model.
 - Metric: The evaluation metric used (acc = accuracy)
 - Task columns: Results for each task in the format "Percentage ± Standard Error"
33