Update README.md
README.md (CHANGED)

```diff
@@ -63,7 +63,7 @@ base_model: /workspace/models/Mistral-Small-Instruct-2409
 model_type: AutoModelForCausalLM
 tokenizer_type: AutoTokenizer
 
-hub_model_id: anthracite-
+hub_model_id: anthracite-org/magnum-v4-22b-r4
 hub_strategy: "all_checkpoints"
 push_dataset_to_hub:
 hf_use_auth_token: true
@@ -81,17 +81,17 @@ load_in_4bit: false
 strict: false
 
 datasets:
-  - path: anthracite-
+  - path: anthracite-org/c2_logs_32k_mistral-v3_v1.2_no_system
     type: custommistralv2v3
-  - path: anthracite-
+  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal-no-system
     type: custommistralv2v3
-  - path: anthracite-
+  - path: anthracite-org/kalo-opus-instruct-3k-filtered-no-system
     type: custommistralv2v3
   - path: anthracite-org/nopm_claude_writing_fixed
     type: custommistralv2v3
-  - path: anthracite-
+  - path: anthracite-org/kalo_opus_misc_240827_no_system
     type: custommistralv2v3
-  - path: anthracite-
+  - path: anthracite-org/kalo_misc_part2_no_system
     type: custommistralv2v3
 #chat_template: mistral_v2v3
 shuffle_merged_datasets: true
@@ -159,12 +159,12 @@ We'd like to thank Recursal / Featherless for sponsoring the compute for this tr
 We would also like to thank all members of Anthracite who made this finetune possible.
 
 ## Datasets
-- [anthracite-
-- [anthracite-
-- [anthracite-
+- [anthracite-org/c2_logs_32k_mistral-v3_v1.2_no_system](https://huggingface.co/datasets/anthracite-org/c2_logs_32k_mistral-v3_v1.2_no_system)
+- [anthracite-org/kalo-opus-instruct-22k-no-refusal-no-system](https://huggingface.co/datasets/anthracite-org/kalo-opus-instruct-22k-no-refusal-no-system)
+- [anthracite-org/kalo-opus-instruct-3k-filtered-no-system](https://huggingface.co/datasets/anthracite-org/kalo-opus-instruct-3k-filtered-no-system)
 - [anthracite-org/nopm_claude_writing_fixed](https://huggingface.co/datasets/anthracite-org/nopm_claude_writing_fixed)
-- [anthracite-
-- [anthracite-
+- [anthracite-org/kalo_opus_misc_240827_no_system](https://huggingface.co/datasets/anthracite-org/kalo_opus_misc_240827_no_system)
+- [anthracite-org/kalo_misc_part2_no_system](https://huggingface.co/datasets/anthracite-org/kalo_misc_part2_no_system)
 
 ## Training
 The training was done for 2 epochs. We used 8x[H100s](https://www.nvidia.com/en-us/data-center/h100/) GPUs graciously provided by [Recursal AI](https://recursal.ai/) / [Featherless AI](https://featherless.ai/) for the full-parameter fine-tuning of the model.
```
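The substance of this commit is replacing truncated `anthracite-` dataset paths with fully qualified `org/name` Hub IDs in the `datasets` stanza. As a minimal sanity-check sketch (hypothetical, not part of the commit), the updated entries can be verified to all be well-formed Hub IDs:

```python
# Hypothetical sanity check, not part of the commit: each dataset path in the
# updated config should be a fully qualified "org/name" Hugging Face Hub ID.
DATASETS = [
    "anthracite-org/c2_logs_32k_mistral-v3_v1.2_no_system",
    "anthracite-org/kalo-opus-instruct-22k-no-refusal-no-system",
    "anthracite-org/kalo-opus-instruct-3k-filtered-no-system",
    "anthracite-org/nopm_claude_writing_fixed",
    "anthracite-org/kalo_opus_misc_240827_no_system",
    "anthracite-org/kalo_misc_part2_no_system",
]


def is_hub_id(path: str) -> bool:
    """True if path has exactly one '/' separating a non-empty org and name."""
    org, sep, name = path.partition("/")
    return bool(sep) and bool(org) and bool(name) and "/" not in name


# Every updated entry is fully qualified; the old truncated form is not.
assert all(is_hub_id(p) for p in DATASETS)
assert not is_hub_id("anthracite-")
```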