nouamanetazi HF staff committed on
Commit
5d8e8eb
1 Parent(s): d673f11

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -12,8 +12,17 @@ Modeling code for Mistral to use with [Nanotron](https://github.com/huggingface/
  # Generate a config file
  python config_tiny_mistral.py

-
  # Run training
  export CUDA_DEVICE_MAX_CONNECTIONS=1 # important for some distributed operations
  torchrun --nproc_per_node=8 run_train.py --config-file config_tiny_mistral.yaml
  ```
+
+ ## 🚀 Use your custom model
+
+ - Update the `MistralConfig` class in `config_tiny_mistral.py` to match your model's configuration
+ - Update the `MistralForTraining` class in `modeling_mistral.py` to match your model's architecture
+ - Pass the previous to the `DistributedTrainer` class in `run_train.py`:
+ ```python
+ trainer = DistributedTrainer(config_file, model_class=MistralForTraining, model_config_class=MistralConfig)
+ ```
+ - Run training as usual
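
The added README steps hand the trainer two classes rather than instances, so the trainer can build the config and the model itself. The sketch below illustrates only that pattern and can be run without Nanotron installed. `MyModelConfig`, `MyModelForTraining`, and `ToyTrainer` are hypothetical stand-ins, not Nanotron's actual classes, and the config field names are illustrative assumptions. Only the keyword names `model_class` and `model_config_class` come from the README change. Unlike the real trainer, the toy version does not read the config file.

```python
from dataclasses import dataclass
from typing import Type


@dataclass
class MyModelConfig:
    """Hypothetical stand-in for a customized `MistralConfig`."""

    hidden_size: int = 64
    num_hidden_layers: int = 2
    num_attention_heads: int = 4

    def __post_init__(self) -> None:
        # Attention heads must split the hidden dimension evenly.
        if self.hidden_size % self.num_attention_heads != 0:
            raise ValueError("hidden_size must be divisible by num_attention_heads")


class MyModelForTraining:
    """Hypothetical stand-in for a customized `MistralForTraining`."""

    def __init__(self, config: MyModelConfig) -> None:
        self.config = config
        self.head_dim = config.hidden_size // config.num_attention_heads


class ToyTrainer:
    """Toy stand-in for `DistributedTrainer`; it only mirrors the keyword names."""

    def __init__(self, config_file: str, model_class: Type, model_config_class: Type) -> None:
        # The path is stored but not parsed in this sketch.
        self.config_file = config_file
        # The trainer receives classes, not instances, and builds the objects itself.
        self.model_config = model_config_class()
        self.model = model_class(self.model_config)


trainer = ToyTrainer(
    "config_tiny_mistral.yaml",
    model_class=MyModelForTraining,
    model_config_class=MyModelConfig,
)
print(type(trainer.model).__name__, trainer.model.head_dim)  # MyModelForTraining 16
```

Keeping the config and the model as separate classes means a mismatch between them, such as a hidden size the heads cannot divide, surfaces when the config is constructed rather than partway through training.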