Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov) | [Discord](https://discord.gg/pvy7H8DZMG) | [Request more models](https://github.com/RichardErkhov/quant_request)

# megatron-gpt2-345m - GGUF
- Model creator: https://huggingface.co/robowaifudev/
- Original model: https://huggingface.co/robowaifudev/megatron-gpt2-345m/
| Name | Quant method | Size |
| ---- | ---- | ---- |
| [megatron-gpt2-345m.Q2_K.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q2_K.gguf) | Q2_K | 0.17GB |
| [megatron-gpt2-345m.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.IQ3_XS.gguf) | IQ3_XS | 0.18GB |
| [megatron-gpt2-345m.IQ3_S.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.IQ3_S.gguf) | IQ3_S | 0.19GB |
| [megatron-gpt2-345m.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q3_K_S.gguf) | Q3_K_S | 0.19GB |
| [megatron-gpt2-345m.IQ3_M.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.IQ3_M.gguf) | IQ3_M | 0.2GB |
| [megatron-gpt2-345m.Q3_K.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q3_K.gguf) | Q3_K | 0.21GB |
| [megatron-gpt2-345m.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q3_K_M.gguf) | Q3_K_M | 0.21GB |
| [megatron-gpt2-345m.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q3_K_L.gguf) | Q3_K_L | 0.23GB |
| [megatron-gpt2-345m.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.IQ4_XS.gguf) | IQ4_XS | 0.22GB |
| [megatron-gpt2-345m.Q4_0.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q4_0.gguf) | Q4_0 | 0.23GB |
| [megatron-gpt2-345m.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.IQ4_NL.gguf) | IQ4_NL | 0.23GB |
| [megatron-gpt2-345m.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q4_K_S.gguf) | Q4_K_S | 0.23GB |
| [megatron-gpt2-345m.Q4_K.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q4_K.gguf) | Q4_K | 0.25GB |
| [megatron-gpt2-345m.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q4_K_M.gguf) | Q4_K_M | 0.25GB |
| [megatron-gpt2-345m.Q4_1.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q4_1.gguf) | Q4_1 | 0.25GB |
| [megatron-gpt2-345m.Q5_0.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q5_0.gguf) | Q5_0 | 0.27GB |
| [megatron-gpt2-345m.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q5_K_S.gguf) | Q5_K_S | 0.27GB |
| [megatron-gpt2-345m.Q5_K.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q5_K.gguf) | Q5_K | 0.29GB |
| [megatron-gpt2-345m.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q5_K_M.gguf) | Q5_K_M | 0.29GB |
| [megatron-gpt2-345m.Q5_1.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q5_1.gguf) | Q5_1 | 0.29GB |
| [megatron-gpt2-345m.Q6_K.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q6_K.gguf) | Q6_K | 0.32GB |
| [megatron-gpt2-345m.Q8_0.gguf](https://huggingface.co/RichardErkhov/robowaifudev_-_megatron-gpt2-345m-gguf/blob/main/megatron-gpt2-345m.Q8_0.gguf) | Q8_0 | 0.41GB |
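As a rough sanity check on the sizes above, a GGUF file's size is driven by the parameter count times the average bits per weight of the quant type. The sketch below uses approximate llama.cpp bits-per-weight figures (assumptions, not values read from these files); actual files run somewhat larger because embeddings and some tensors stay at higher precision.

```python
# Rough lower-bound GGUF size estimate from parameter count and bits per weight.
# The bits-per-weight figures are approximate llama.cpp averages (assumptions).

PARAMS = 345e6  # parameter count of megatron-gpt2-345m

BITS_PER_WEIGHT = {
    "Q4_0": 4.5,
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
}

def estimate_gb(quant: str) -> float:
    """Approximate lower-bound file size in GB for a given quant type."""
    return PARAMS * BITS_PER_WEIGHT[quant] / 8 / 1e9

for q in BITS_PER_WEIGHT:
    print(f"{q}: ~{estimate_gb(q):.2f} GB")
```

For example, Q8_0 estimates to roughly 0.37 GB against the 0.41 GB listed, with the gap coming from tensors kept above 8 bits.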
Original model description:
---
language:
- en
tags:
- gpt2
license: apache-2.0
widget:
- text: It was a bright cold day in April, and the clocks were striking thirteen. Winston Smith,
datasets:
- wikitext
- openwebtext
- spacemanidol/cc-stories
model-index:
- name: megatron-gpt2-345m
  results:
  - task:
      type: text-generation
      name: Text generation
    dataset:
      name: WikiText-103
      type: wikitext
    metrics:
    - type: wikitext
      value: 19.31
      name: Perplexity
  - task:
      type: text-generation
      name: Text generation
    dataset:
      name: WikiText-2
      type: wikitext
    metrics:
    - type: wikitext
      value: 17.151
      name: Perplexity
  - task:
      type: text-generation
      name: Text generation
    dataset:
      name: LAMBADA
      type: lambada
    metrics:
    - type: lambada
      value: 5.509
      name: Perplexity
    - type: lambada
      value: 68.31%
      name: Accuracy
---
This is an archive of [nvidia/megatron-gpt2-345m](https://huggingface.co/nvidia/megatron-gpt2-345m) that contains readily available model weights (375M). Its perplexity on WikiText-103 is 19.31 [1]. In comparison, the 1.5B-parameter GPT-2 scores 17.48 and the 762M-parameter variant scores 22.05 [2].
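For context on those numbers, perplexity is the exponential of the mean per-token cross-entropy (negative log-likelihood). A tiny illustrative sketch, with made-up loss values rather than anything measured on this model:

```python
import math

# Perplexity = exp(mean negative log-likelihood per token).
# The per-token losses below are invented purely for illustration.
token_nlls = [2.9, 3.1, 2.8, 3.2]  # cross-entropy in nats

perplexity = math.exp(sum(token_nlls) / len(token_nlls))
print(f"Perplexity: {perplexity:.2f}")
```

Lower perplexity means the model assigns higher probability to the held-out text, which is why the 1.5B model's 17.48 beats this model's 19.31.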
### References
1. Shoeybi, Mohammad, et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. arXiv, 2019, [https://doi.org/10.48550/ARXIV.1909.08053](https://doi.org/10.48550/ARXIV.1909.08053).
2. Alec Radford, et al. Language Models are Unsupervised Multitask Learners. OpenAI, 2019. [https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf).
## Description
[Megatron](https://arxiv.org/pdf/1909.08053.pdf) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This particular Megatron model was trained from a generative, left-to-right transformer in the style of GPT-2. This model was trained on text sourced from Wikipedia, RealNews, OpenWebText, and CC-Stories. It contains 345 million parameters.
Find more information at [https://github.com/NVIDIA/Megatron-LM](https://github.com/NVIDIA/Megatron-LM)
# How to run Megatron GPT2 using Transformers
## Text generation
The following code shows how to use the Megatron GPT2 checkpoint and Transformers to generate text.
```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("robowaifudev/megatron-gpt2-345m")

# Use half precision on GPU, full precision on CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
    model.half()
else:
    device = torch.device("cpu")
model.to(device)
model.eval()

# Generate
prompt = (
    "It was a bright cold day in April, and the clocks were striking thirteen. Winston Smith,"
)
input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
output = model.generate(
    input_ids=input_ids,
    max_length=input_ids.size(1) + 128,  # prompt length in tokens plus 128 new tokens
    do_sample=True,
    top_k=64,
    top_p=0.9,
    temperature=0.8,
    num_return_sequences=2,
    repetition_penalty=1.025,
)

# Output the text
print("Prompt:", prompt)
print("*" * 3)
for i, sentence in enumerate(output):
    text = tokenizer.decode(sentence, clean_up_tokenization_spaces=True)
    print(f"{i}:", text)
    print("*" * 3)
```
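The `top_k` and `top_p` arguments above restrict sampling to the most likely tokens. A minimal standalone sketch of the filtering they perform (an illustration of the idea, not the Transformers implementation):

```python
def top_k_top_p_filter(probs, top_k=64, top_p=0.9):
    """Return the (index, prob) pairs surviving top-k then top-p (nucleus)
    filtering, given a list of token probabilities."""
    # Sort token indices by descending probability.
    ranked = sorted(enumerate(probs), key=lambda x: x[1], reverse=True)
    # Top-k: keep at most k candidates.
    ranked = ranked[:top_k]
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cum = [], 0.0
    for idx, p in ranked:
        kept.append((idx, p))
        cum += p
        if cum >= top_p:
            break
    return kept

# Toy distribution over a 5-token vocabulary.
probs = [0.5, 0.3, 0.1, 0.06, 0.04]
print(top_k_top_p_filter(probs, top_k=4, top_p=0.9))
```

With this toy distribution, only the first three tokens survive, since their cumulative probability already reaches 0.9; sampling then happens among the survivors after renormalization.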
# Original code
The original Megatron code can be found here: [https://github.com/NVIDIA/Megatron-LM](https://github.com/NVIDIA/Megatron-LM).