XSF0528 committed
Commit: aa94e84
Parent: 4422273

Model save

Files changed (2):
  1. README.md +3 -5
  2. generation_config.json +7 -0
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 library_name: transformers
 license: other
-base_model: facebook/opt-125m
+base_model: facebook/opt-350m
 tags:
 - trl
 - sft
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # sft_openassistant-guanaco
 
-This model is a fine-tuned version of [facebook/opt-125m](https://huggingface.co/facebook/opt-125m) on an unknown dataset.
+This model is a fine-tuned version of [facebook/opt-350m](https://huggingface.co/facebook/opt-350m) on an unknown dataset.
 
 ## Model description
 
@@ -40,10 +40,8 @@ The following hyperparameters were used during training:
 - eval_batch_size: 8
 - seed: 42
 - distributed_type: multi-GPU
-- num_devices: 4
 - gradient_accumulation_steps: 16
-- total_train_batch_size: 4096
-- total_eval_batch_size: 32
+- total_train_batch_size: 1024
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 3.0
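For context on the hyperparameter change above: the total_train_batch_size reported in these cards is per_device_train_batch_size × num_devices × gradient_accumulation_steps. The sketch below only checks that arithmetic; the per-device batch size and device count are inferred assumptions, not values stated in this commit.

```python
# Batch-size arithmetic implied by the card diff (a sketch, not the training script).
gradient_accumulation_steps = 16

# Previous card: 4 devices with total_train_batch_size 4096
# implies a per-device train batch size of 64.
per_device_train_batch_size = 4096 // (4 * gradient_accumulation_steps)
assert per_device_train_batch_size == 64

# Updated card: total_train_batch_size 1024 with the device count no longer listed;
# any split where per_device * num_devices == 64 (e.g. 64 x 1 or 16 x 4) is consistent.
assert 64 * 1 * gradient_accumulation_steps == 1024
```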
generation_config.json ADDED
@@ -0,0 +1,7 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 2,
+  "eos_token_id": 2,
+  "pad_token_id": 1,
+  "transformers_version": "4.44.2"
+}
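The added generation_config.json carries only default decoding metadata. Below is a minimal sketch of how such a file can be written and read back with transformers; the output directory name is an assumption, and the _from_model_config / transformers_version fields are bookkeeping that save_pretrained() fills in automatically.

```python
from transformers import GenerationConfig

# Recreate the added file's contents; for OPT-family models, </s> (id 2) serves
# as both BOS and EOS, and <pad> is id 1.
gen_config = GenerationConfig(bos_token_id=2, eos_token_id=2, pad_token_id=1)

# Writes generation_config.json into the given directory (directory name assumed).
gen_config.save_pretrained("sft_openassistant-guanaco")

# Read it back; generate() picks this config up automatically when the model
# is loaded with from_pretrained().
reloaded = GenerationConfig.from_pretrained("sft_openassistant-guanaco")
print(reloaded.pad_token_id)  # 1
```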