anezatra/gpt2-alpaca-355M



	---
	license: mit
	tags:
	- generated_from_trainer
	model-index:
	- name: GPT2-Medium-Alpaca-355m
	  results: []
	datasets:
	- tatsu-lab/alpaca
	widget:
	- text: |-
	    You are a chat bot that provides professional answers to questions asked

	    ### Instruction:
	    What is the purpose of life

	    ### Response:
	language:
	- en
	pipeline_tag: text-generation
	---


	# OpenAI GPT-2 355M

	## Model description

	This model is a fine-tuned version of [gpt2-medium](https://huggingface.co/gpt2-medium) (355M parameters), trained by the Anezatra team on the `tatsu-lab/alpaca` instruction-following dataset. It is intended for instruction-style text generation, such as chat applications. No evaluation results are reported in this card.
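
As a usage illustration, here is a minimal sketch of prompting the model with the instruction format shown in the card's widget example. The repository id is taken from the page header. The helper names `build_prompt` and `generate` and the sampling settings are assumptions made for this example, not values confirmed by the authors.

```python
# Minimal sketch, not an official usage recipe.
MODEL_ID = "anezatra/gpt2-alpaca-355M"  # repository id from the page header

# Preamble copied verbatim from the widget example in the card metadata.
SYSTEM_LINE = (
    "You are a chat bot that provides professional answers to questions asked"
)


def build_prompt(instruction: str) -> str:
    """Format an instruction the same way as the widget example."""
    return (
        f"{SYSTEM_LINE}\n\n"
        f"### Instruction:\n{instruction.strip()}\n\n"
        "### Response:"
    )


def generate(instruction: str, max_new_tokens: int = 128) -> str:
    """Generate a response (hypothetical helper; sampling values are assumed)."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(
        build_prompt(instruction),
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.9,               # assumed sampling setting
        return_full_text=False,  # return only the continuation
    )
    return outputs[0]["generated_text"].strip()
```

Calling `generate("What is the purpose of life")` downloads the weights on first use. Keeping the prompt layout identical to the widget example is likely to matter, since the model was presumably fine-tuned on text in that shape.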

	## Training Procedure

	This model was trained on 4 x A100 GPUs.

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 1
	- eval_batch_size: 1
	- seed: 42
	- gradient_accumulation_steps: 128
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.15
	- num_epochs: 1
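
The hyperparameters above can be sketched as keyword arguments for `transformers.TrainingArguments`, together with the arithmetic behind the total batch size. This is a reconstruction under assumptions: `output_dir` is a hypothetical path, `train_batch_size` is read as the per-device value, and the original training script is not part of this card.

```python
# Values copied from the hyperparameter list; keys are TrainingArguments fields.
TRAINING_KWARGS = {
    "output_dir": "gpt2-alpaca-355M",   # hypothetical output path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 1,   # assumes train_batch_size is per device
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 128,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.15,
    "num_train_epochs": 1,
}


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_processes: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device * accumulation_steps * num_processes


def make_training_args():
    """Build TrainingArguments from the values above (requires transformers)."""
    # Imported lazily so the arithmetic can run without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)


# One process: 1 * 128 * 1 = 128, matching the reported total_train_batch_size.
print(effective_batch_size(1, 128))
# Four data-parallel processes would instead give 1 * 128 * 4 = 512.
print(effective_batch_size(1, 128, num_processes=4))
```

The reported `total_train_batch_size` of 128 equals the single-process product. The card does not say how the 4 GPUs were used, so the process count here is left as a parameter.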