---
library_name: transformers
tags:
- text-generation-inference
license: mit
language:
- en
---
# Model Card for gpt2-xl
## Model Details
### Model Description
This model card presents details for the gpt2-xl model, a large autoregressive language model optimized for text generation tasks. The model uses the GPT-2 architecture developed by OpenAI.
- **Model type:** Autoregressive Language Model
- **Language(s) (NLP):** English
## Uses
### Direct Use
The model can be used for text generation tasks, such as completing sentences or generating coherent paragraphs.
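For example, this kind of direct use can be exercised through the high-level `pipeline` API in transformers; the prompt and generation length below are illustrative choices only, not recommendations from the model authors:

```python
from transformers import pipeline

# Minimal sketch of direct use: generate a continuation of a short prompt.
generator = pipeline("text-generation", model="gpt2-xl")
result = generator("Bananas are a great", max_length=30, num_return_sequences=1)
print(result[0]["generated_text"])
```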
## Bias, Risks, and Limitations
The model may exhibit biases present in the training data and could generate inappropriate or sensitive content. Users should exercise caution when deploying the model in production.
### Recommendations
Users should be aware of potential biases and limitations of the model, particularly when used in applications that involve sensitive or high-stakes content.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the gpt2-xl tokenizer and model weights from the Hugging Face Hub
model_name = "gpt2-xl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Encode a prompt and generate a greedy (deterministic) continuation
input_txt = "Bananas are a great"
input_ids = tokenizer(input_txt, return_tensors="pt")["input_ids"]
output = model.generate(input_ids, max_length=200, do_sample=False)
print(tokenizer.decode(output[0]))
```
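Greedy decoding (`do_sample=False`) is deterministic but can become repetitive; a sampled variant is sketched below, reusing the model, tokenizer, and `input_ids` from above. The `top_p` and `temperature` values are illustrative defaults, not settings recommended for this model:

```python
# Sampled decoding often yields more varied text than greedy decoding.
sampled = model.generate(
    input_ids,
    max_length=200,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
)
print(tokenizer.decode(sampled[0]))
```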
## Training Details
### Training Data
The model was trained on a diverse range of internet text, including news articles, books, and websites.
#### Training Hyperparameters
- **Training regime:** Autoregressive training with a large-scale language modeling objective
- **Compute infrastructure:** GPUs (specific details not disclosed)
## Evaluation
### Testing Data, Factors & Metrics
The model was evaluated on standard language modeling benchmarks, including perplexity scores on held-out data.
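As a rough sketch of how a perplexity score can be computed with this model (the benchmark-specific evaluation setup is not disclosed here), the language-modeling loss returned by the model can be exponentiated; the text below is a placeholder, not actual evaluation data:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2-xl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

text = "The quick brown fox jumps over the lazy dog."
input_ids = tokenizer(text, return_tensors="pt")["input_ids"]

with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy loss
    # over predicted tokens; perplexity is its exponential.
    outputs = model(input_ids, labels=input_ids)

perplexity = torch.exp(outputs.loss)
print(f"Perplexity: {perplexity.item():.2f}")
```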