---
license: mit
base_model: Toflamus/GPT-2_para3M
tags:
- generated_from_trainer
model-index:
- name: Output
results: []
---
# Output
This model is a fine-tuned version of [Toflamus/GPT-2_para3M](https://huggingface.co/Toflamus/GPT-2_para3M) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 5.9785
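
If this loss is the mean per-token cross-entropy in nats (the usual `Trainer` convention for causal language models; an assumption, since the card does not say), it corresponds to a perplexity of roughly exp(5.9785) ≈ 395:

```python
import math

# Assumes the eval loss is mean cross-entropy in nats per token.
eval_loss = 5.9785
perplexity = math.exp(eval_loss)
print(f"{perplexity:.1f}")  # ~394.9
```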
## Model description
Judging by the base model's name, this appears to be a small GPT-2-style causal language model with roughly 3M parameters, fine-tuned from [Toflamus/GPT-2_para3M](https://huggingface.co/Toflamus/GPT-2_para3M). Further details were not recorded.
## Intended uses & limitations
More information needed
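
No intended uses were recorded. For reference, a minimal loading-and-generation sketch, assuming the checkpoint is published on the Hub as `Toflamus/Finetuned_with_eval` (a hypothetical repo id; substitute the actual one):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- replace with the actual Hub id of this checkpoint.
repo_id = "Toflamus/Finetuned_with_eval"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Generate a short continuation; the sampling settings are illustrative only.
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```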
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training; a `TrainingArguments` sketch equivalent to them follows the list:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 5
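
A sketch reconstructing the configuration above (argument names follow Transformers 4.32.0; the output directory is a placeholder, and the evaluation cadence is inferred from the 100-step spacing in the results table below):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Output",             # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=8,   # effective batch: 16 * 8 = 128 sequences
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=5,
    evaluation_strategy="steps",     # assumption: eval every 100 steps, matching the table
    eval_steps=100,
    logging_steps=100,
)
```

Adam with betas (0.9, 0.999) and epsilon 1e-08 is the library default optimizer setting, so it is not set explicitly here.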
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 7.6797 | 0.27 | 100 | 7.0355 |
| 6.9842 | 0.55 | 200 | 6.6754 |
| 6.7517 | 0.82 | 300 | 6.5074 |
| 6.6145 | 1.09 | 400 | 6.3942 |
| 6.5294 | 1.37 | 500 | 6.3043 |
| 6.4228 | 1.64 | 600 | 6.2332 |
| 6.3582 | 1.91 | 700 | 6.1772 |
| 6.3 | 2.19 | 800 | 6.1279 |
| 6.2841 | 2.46 | 900 | 6.0878 |
| 6.2103 | 2.73 | 1000 | 6.0572 |
| 6.1908 | 3.01 | 1100 | 6.0325 |
| 6.1733 | 3.28 | 1200 | 6.0132 |
| 6.1383 | 3.55 | 1300 | 5.9991 |
| 6.149 | 3.83 | 1400 | 5.9901 |
| 6.1383 | 4.1 | 1500 | 5.9836 |
| 6.1155 | 4.37 | 1600 | 5.9800 |
| 6.1275 | 4.65 | 1700 | 5.9788 |
| 6.1257 | 4.92 | 1800 | 5.9785 |
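
For a quick visual check of convergence, a minimal matplotlib sketch plotting the validation-loss column against step (values copied from the table above):

```python
import matplotlib.pyplot as plt

# Step / validation-loss pairs copied from the table above.
steps = list(range(100, 1900, 100))
val_loss = [7.0355, 6.6754, 6.5074, 6.3942, 6.3043, 6.2332, 6.1772, 6.1279,
            6.0878, 6.0572, 6.0325, 6.0132, 5.9991, 5.9901, 5.9836, 5.9800,
            5.9788, 5.9785]

plt.plot(steps, val_loss, marker="o")
plt.xlabel("Step")
plt.ylabel("Validation loss")
plt.title("Fine-tuning validation loss")
plt.show()
```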
### Framework versions
- Transformers 4.32.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.2