---
library_name: transformers
tags:
- text-generation-inference
license: mit
language:
- en
---

# Model Card for gpt2-xl

## Model Details

### Model Description

This model card describes gpt2-xl, the 1.5B-parameter version of GPT-2 and a large autoregressive language model optimized for text generation tasks. The model uses the GPT-2 architecture developed by OpenAI.

- **Model type:** Autoregressive Language Model
- **Language(s) (NLP):** English

## Uses

### Direct Use

The model can be used for text generation tasks, such as completing sentences or generating coherent paragraphs.

## Bias, Risks, and Limitations

The model may reproduce biases present in its training data and can generate inappropriate or sensitive content. Users should exercise caution when deploying the model in production.

### Recommendations

Users should be aware of the model's potential biases and limitations, particularly in applications that involve sensitive or high-stakes content.

## How to Get Started with the Model

Use the code below to get started with the model. This example uses greedy decoding (`do_sample=False`), which deterministically picks the highest-probability token at each step.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2-xl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_txt = "Bananas are a great"
input_ids = tokenizer(input_txt, return_tensors="pt")["input_ids"]
# GPT-2 has no pad token; reuse EOS to silence the generate() warning.
output = model.generate(
    input_ids,
    max_length=200,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0]))
```
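Greedy decoding is deterministic but often loops on open-ended prompts. The same `generate` API also supports sampling; the sketch below reuses the `model`, `tokenizer`, and `input_ids` defined above, and the decoding values are illustrative defaults rather than settings tuned for gpt2-xl.

```python
# Sampled decoding: draw from the model's output distribution instead of the argmax.
output = model.generate(
    input_ids,
    max_length=200,
    do_sample=True,       # enable sampling
    temperature=0.7,      # <1.0 sharpens the distribution
    top_k=50,             # restrict each step to the 50 most likely tokens
    top_p=0.95,           # nucleus sampling: smallest set with cumulative prob >= 0.95
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Higher temperatures and larger `top_p` trade coherence for diversity; greedy decoding sits at the deterministic end of that spectrum.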
## Training Details

### Training Data

The model was trained on a diverse range of internet text (OpenAI's WebText corpus, scraped from outbound Reddit links), including news articles, books, and websites.

#### Training Hyperparameters

- **Training regime:** Autoregressive language modeling (next-token prediction) at scale
- **Compute infrastructure:** GPUs (specific details not disclosed)

## Evaluation

### Testing Data, Factors & Metrics

The model was evaluated on standard language modeling benchmarks, primarily perplexity on held-out data.
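Perplexity is the exponential of the average per-token cross-entropy. Below is a minimal sketch of computing it with the `model` and `tokenizer` loaded above; the `eval_texts` list is a hypothetical stand-in for a real held-out set.

```python
import math
import torch

# Hypothetical held-out texts; substitute a real evaluation set.
eval_texts = [
    "The quick brown fox jumps over the lazy dog.",
    "Language models assign probabilities to sequences of tokens.",
]

model.eval()
total_nll, total_tokens = 0.0, 0
with torch.no_grad():
    for text in eval_texts:
        ids = tokenizer(text, return_tensors="pt")["input_ids"]
        # Passing labels=input_ids makes the model return the mean
        # cross-entropy over the sequence (labels are shifted internally).
        loss = model(ids, labels=ids).loss
        n = ids.size(1) - 1   # number of predicted tokens
        total_nll += loss.item() * n
        total_tokens += n

print(f"perplexity: {math.exp(total_nll / total_tokens):.2f}")
```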