gpt-j-ggml / README.md
LLukas22's picture
Generated README.md
5e17023
|
raw
history blame
3.84 kB
metadata
language:
  - en
tags:
  - llm-rs
  - ggml
pipeline_tag: text-generation
datasets:
  - the_pile

GGML covnerted Models of EleutherAI's GPT-J model

Description

GPT-J 6B is a transformer model trained using Ben Wang's Mesh Transformer JAX. "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters.

Hyperparameter Value
nparametersn_{parameters} 6053381344
nlayersn_{layers} 28*
dmodeld_{model} 4096
dffd_{ff} 16384
nheadsn_{heads} 16
dheadd_{head} 256
nctxn_{ctx} 2048
nvocabn_{vocab} 50257/50400† (same tokenizer as GPT-2/3)
Positional Encoding Rotary Position Embedding (RoPE)
RoPE Dimensions 64

* Each layer consists of one feedforward block and one self attention block.

† Although the embedding matrix has a size of 50400, only 50257 entries are used by the GPT-2 tokenizer.

The model consists of 28 layers with a model dimension of 4096, and a feedforward dimension of 16384. The model dimension is split into 16 heads, each with a dimension of 256. Rotary Position Embedding (RoPE) is applied to 64 dimensions of each head. The model is trained with a tokenization vocabulary of 50257, using the same set of BPEs as GPT-2/GPT-3.

Converted Models

Name Based on Type Container GGML Version
gpt-j-6b-f16.bin EleutherAI/gpt-j-6b F16 GGML V3
gpt-j-6b-q4_0.bin EleutherAI/gpt-j-6b Q4_0 GGML V3
gpt-j-6b-q4_0-ggjt.bin EleutherAI/gpt-j-6b Q4_0 GGJT V3
gpt-j-6b-q5_1-ggjt.bin EleutherAI/gpt-j-6b Q5_1 GGJT V3

Usage

Python via llm-rs:

Installation

Via pip: pip install llm-rs

Run inference

from llm_rs import AutoModel

#Load the model, define any model you like from the list above as the `model_file`
model = AutoModel.from_pretrained("rustformers/gpt-j-ggml",model_file="gpt-j-6b-q4_0-ggjt.bin")

#Generate
print(model.generate("The meaning of life is"))

Rust via Rustformers/llm:

Installation

git clone --recurse-submodules https://github.com/rustformers/llm.git
cd llm
cargo build --release

Run inference

cargo run --release -- gptj infer -m path/to/model.bin  -p "Tell me how cool the Rust programming language is:"