XSF0528/sft_openassistant-guanaco

XSF0528 committed on Sep 2

Commit aa94e84 • 1 Parent(s): 4422273

Model save

Files changed (2)

README.md +3 -5
generation_config.json +7 -0

README.md CHANGED

@@ -1,7 +1,7 @@
 ---
 library_name: transformers
 license: other
-base_model: facebook/opt-125m
+base_model: facebook/opt-350m
 tags:
 - trl
 - sft
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 # sft_openassistant-guanaco
-This model is a fine-tuned version of [facebook/opt-125m](https://huggingface.co/facebook/opt-125m) on an unknown dataset.
+This model is a fine-tuned version of [facebook/opt-350m](https://huggingface.co/facebook/opt-350m) on an unknown dataset.
 ## Model description
@@ -40,10 +40,8 @@ The following hyperparameters were used during training:
 - eval_batch_size: 8
 - seed: 42
 - distributed_type: multi-GPU
-- num_devices: 4
 - gradient_accumulation_steps: 16
-- total_train_batch_size: 4096
-- total_eval_batch_size: 32
+- total_train_batch_size: 1024
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 3.0
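
Reviewer note: the change in `total_train_batch_size` from 4096 to 1024 can be sanity-checked with a little arithmetic. The sketch below assumes the usual relation, total = per-device batch size × number of devices × gradient accumulation steps. The per-device `train_batch_size` line is not visible in this hunk, so the per-device figure is only implied, not confirmed. The helper names are hypothetical.

```python
def effective_train_batch_size(per_device_batch_size: int,
                               num_devices: int,
                               gradient_accumulation_steps: int) -> int:
    """Assumed relation between the per-device and total train batch size."""
    return per_device_batch_size * num_devices * gradient_accumulation_steps


def implied_samples_per_step(total_train_batch_size: int,
                             gradient_accumulation_steps: int) -> int:
    """Samples processed across all devices in one forward/backward pass."""
    if total_train_batch_size % gradient_accumulation_steps != 0:
        raise ValueError("total batch size is not a multiple of the accumulation steps")
    return total_train_batch_size // gradient_accumulation_steps


# Values taken from the diff; gradient_accumulation_steps is 16 on both sides.
old_per_step = implied_samples_per_step(4096, 16)
new_per_step = implied_samples_per_step(1024, 16)

# The old card listed num_devices: 4, which implies this per-device batch size.
old_per_device = old_per_step // 4

print(old_per_step, new_per_step, old_per_device)
```

Under that assumption, the product of per-device batch size and device count dropped by a factor of four between the two runs. The new card no longer lists `num_devices`, so the diff alone does not say how that product is split.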

generation_config.json ADDED

@@ -0,0 +1,7 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 2,
+  "eos_token_id": 2,
+  "pad_token_id": 1,
+  "transformers_version": "4.44.2"
+}
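
Reviewer note: a minimal sketch for inspecting the added file using only the Python standard library. The JSON is copied from the diff above. The helper name `summarize_generation_config` is hypothetical and is not part of any library.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG_TEXT = """
{
  "_from_model_config": true,
  "bos_token_id": 2,
  "eos_token_id": 2,
  "pad_token_id": 1,
  "transformers_version": "4.44.2"
}
"""


def summarize_generation_config(text: str) -> dict:
    """Parse the config and report how the special token ids relate."""
    config = json.loads(text)
    return {
        "bos_equals_eos": config["bos_token_id"] == config["eos_token_id"],
        "pad_distinct_from_eos": config["pad_token_id"] != config["eos_token_id"],
        "transformers_version": config["transformers_version"],
    }


summary = summarize_generation_config(GENERATION_CONFIG_TEXT)
print(summary)
```

In this file the beginning-of-sequence and end-of-sequence tokens share id 2, while padding uses a separate id, 1.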