Update README.md
README.md
@@ -105,6 +105,7 @@ python convert-hf-to-gguf.py Mambaoutai
 ### Training Hardware
 
 The model checkpoints with no instruction data have been fully trained on an NVIDIA DGX H100 provided by OVH Cloud, whereas the decay phases with instruction data have been carried out on an HPE Cray with 8xH100 on Orange Cloud Avenue.
+The ablation experiments were conducted on 16 nodes(4xA100-40GB) on MeluXina.
 
 ### Model hyperparameters
 