---
license: apache-2.0
---

Here I provide you with a completely untrained, from-scratch GPT-2 model (the 124M parameter version). All of its weights have been randomized and then saved, wiping out all previous training. It was then trained for 50 epochs on the original book "Peter Pan", just so I could produce the model and tokenizer files to upload to Hugging Face. As an interesting side note, it is surprisingly almost coherent if you test it with the example text in the inference widget to the right and press "Compute".

## What is this and how is it different?

This is different from simply downloading a fresh 'gpt2', because all pre-training has been wiped out (except for the 50 epochs mentioned above).

## WHY?!

This allows you to train the model from scratch, dedicating its full capacity to your specific use case! A minimal training sketch is included at the end of this card. You can see more examples on the original GPT-2 model card page: https://huggingface.co/gpt2

## Example usage

Requirements:

```
pip install transformers torch
```

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Substitute 'your_model_name' with the name of your model
model_name_or_path = 'your_model_name'

# Load the tokenizer
tokenizer = GPT2Tokenizer.from_pretrained(model_name_or_path)

# Load the model
model = GPT2LMHeadModel.from_pretrained(model_name_or_path)

# Model input
input_text = "Hello, how are you?"

# Encode input text
input_ids = tokenizer.encode(input_text, return_tensors='pt')

# Generate output (do_sample=True so the temperature setting takes effect)
output = model.generate(input_ids, max_length=50, num_return_sequences=1, do_sample=True, temperature=0.7)

# Decode output
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
```

## License: Apache 2.0

The Apache 2.0 license allows developers to modify, copy, or update the source code of the licensed software, and to distribute any copies or modified versions they make.

- COMMERCIAL USE: YES
- PERSONAL USE: YES
- EDUCATIONAL USE: YES

Enjoy!
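
## Training from scratch (sketch)

Since the whole point of this model is to be trained on your own data, here is a minimal, illustrative sketch of how such a from-scratch training run might look using the Hugging Face `Trainer`. The model name, the corpus file `your_corpus.txt`, the output directory, and all hyperparameters are placeholders to substitute with your own; `TextDataset` is used only to keep the example short (newer `transformers` releases prefer the `datasets` library for this step).

```python
from transformers import (
    GPT2LMHeadModel,
    GPT2Tokenizer,
    TextDataset,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Substitute 'your_model_name' with the name of your model
model_name_or_path = 'your_model_name'

tokenizer = GPT2Tokenizer.from_pretrained(model_name_or_path)
model = GPT2LMHeadModel.from_pretrained(model_name_or_path)

# Build a language-modeling dataset from a plain-text file (path is a placeholder)
train_dataset = TextDataset(
    tokenizer=tokenizer,
    file_path='your_corpus.txt',
    block_size=128,
)

# GPT-2 is a causal language model, so masked-LM collation is disabled
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

# Illustrative hyperparameters; tune these for your own corpus and hardware
training_args = TrainingArguments(
    output_dir='./gpt2-from-scratch',
    num_train_epochs=3,
    per_device_train_batch_size=4,
    save_steps=500,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)

trainer.train()

# Save the trained model and tokenizer so they can be reloaded with from_pretrained()
trainer.save_model('./gpt2-from-scratch')
tokenizer.save_pretrained('./gpt2-from-scratch')
```

After training, the saved directory can be loaded like any other GPT-2 checkpoint, e.g. `GPT2LMHeadModel.from_pretrained('./gpt2-from-scratch')`, and used with the generation example above.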