Here I provide a completely untrained, from-scratch GPT-2 model, in the 124M parameter version.

All of its weights have been randomized and then saved, wiping out all previous training.

It was then trained for 50 epochs on the original book "Peter Pan", just so I could produce the model and tokenizer files to upload to Hugging Face.

As an interesting side note, the output is surprisingly almost coherent: try the inference widget to the right with the example text and press "Compute".

What is this and how is it different? Unlike simply downloading 'gpt2', all pre-training has been wiped out (except for the 50 epochs mentioned above).

WHY?! This lets you train the model from scratch on your own data, so the model's full capacity goes toward your specific use-case!
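If you want to reproduce a randomized model like this one yourself, here is a minimal sketch. It assumes the `transformers` default `GPT2Config`, whose settings match the 124M GPT-2 architecture; constructing a model directly from a config (rather than with `from_pretrained`) gives randomly initialized weights:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# The default GPT2Config matches the 124M GPT-2 (12 layers, 768 hidden, 12 heads)
config = GPT2Config()

# Building the model from a config gives random weights -- no pre-training at all
model = GPT2LMHeadModel(config)

# Count parameters to confirm the size class (~124M)
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")
```

Calling `model.save_pretrained('my_untrained_gpt2')` afterwards writes the weights out in the same format as this repository.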

You can see more examples on the original GPT-2 model card page at https://huggingface.co/gpt2

Example usage:

Requirements: `pip install transformers torch`

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Substitute 'your_model_name' with the name of your model
model_name_or_path = 'your_model_name'

# Load the tokenizer
tokenizer = GPT2Tokenizer.from_pretrained(model_name_or_path)

# Load the model
model = GPT2LMHeadModel.from_pretrained(model_name_or_path)

# Model input
input_text = "Hello, how are you?"

# Encode input text
input_ids = tokenizer.encode(input_text, return_tensors='pt')

# Generate output (do_sample=True so that temperature actually takes effect)
output = model.generate(input_ids, max_length=50, num_return_sequences=1,
                        do_sample=True, temperature=0.7)

# Decode output
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
```
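For the "train it yourself" part, a single training step might look like the sketch below. This is an illustration under stated assumptions, not this repository's actual training script: it builds a fresh random model from the default config and uses random token ids as a stand-in for a tokenized corpus.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Fresh, randomly initialized 124M-class model (stand-in for this repo's checkpoint)
model = GPT2LMHeadModel(GPT2Config())
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Stand-in batch of token ids; in practice, tokenize your own corpus instead
input_ids = torch.randint(0, model.config.vocab_size, (2, 32))

# Passing labels=input_ids makes the model return the causal LM loss
outputs = model(input_ids=input_ids, labels=input_ids)
loss = outputs.loss

loss.backward()
optimizer.step()
optimizer.zero_grad()

print(f"loss: {loss.item():.3f}")
```

Loop this over batches of your data for as many epochs as you need, then call `model.save_pretrained(...)` to save the result.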

License: Apache 2.0. The Apache 2.0 license allows developers to alter, copy, or update the source code of existing software. Furthermore, developers can distribute any copies or modifications that they make of the source code.

COMMERCIAL USE: YES PERSONAL USE: YES EDUCATIONAL USE: YES

Enjoy!
