metadata

language:
  - en
tags:
  - llm-rs
  - ggml
pipeline_tag: text-generation
datasets:
  - the_pile

GGML-converted versions of EleutherAI's GPT-J model

Description

GPT-J 6B is a transformer model trained using Ben Wang's Mesh Transformer JAX. "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters.

Hyperparameter	Value
$n_{parameters}$	6053381344
$n_{layers}$	28*
$d_{model}$	4096
$d_{ff}$	16384
$n_{heads}$	16
$d_{head}$	256
$n_{ctx}$	2048
$n_{vocab}$	50257/50400† (same tokenizer as GPT-2/3)
Positional Encoding	Rotary Position Embedding (RoPE)
RoPE Dimensions	64

* Each layer consists of one feedforward block and one self-attention block.

† Although the embedding matrix has a size of 50400, only 50257 entries are used by the GPT-2 tokenizer.
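As a quick sanity check, the figures in the table can be related to one another with a few lines of arithmetic. This is only an illustrative sketch: the variable names are made up for this example, and the values are copied from the table rather than read from a model file.

```python
# Hyperparameters copied from the table above (names are illustrative only).
n_layers = 28
d_model = 4096
d_ff = 16384
n_heads = 16
d_head = 256
n_vocab_used = 50257     # entries the GPT-2 tokenizer actually uses
n_vocab_padded = 50400   # rows in the embedding matrix
rope_dims = 64

# The attention heads partition the model dimension exactly.
assert n_heads * d_head == d_model

# Width of the feedforward block relative to the model dimension.
ff_ratio = d_ff // d_model

# Embedding rows that do not correspond to any tokenizer entry.
unused_rows = n_vocab_padded - n_vocab_used

# Fraction of each head's dimensions that RoPE is applied to.
rope_fraction = rope_dims / d_head

print(ff_ratio, unused_rows, rope_fraction)  # prints: 4 143 0.25
```

In other words, the feedforward block is four times as wide as the model dimension, 143 embedding rows are never used by the tokenizer, and RoPE covers a quarter of each head.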

The model consists of 28 layers with a model dimension of 4096 and a feedforward dimension of 16384. The model dimension is split into 16 heads, each with a dimension of 256. Rotary Position Embedding (RoPE) is applied to 64 dimensions of each head. The model was trained with a tokenization vocabulary of 50257 tokens, using the same set of BPEs as GPT-2/GPT-3.
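To make "RoPE is applied to 64 dimensions of each head" concrete, here is a minimal pure-Python sketch of a partial rotary embedding on a single head vector. It rests on several assumptions that this card does not state: that the rotated dimensions are the first 64 of each head, that adjacent pairs of dimensions are rotated together, and that the frequency base is 10000. The function name `apply_partial_rope` is hypothetical and is not part of llm-rs or any other library.

```python
import math

D_HEAD = 256         # per-head dimension, from the table above
ROPE_DIMS = 64       # number of dimensions per head that receive RoPE
ROPE_BASE = 10000.0  # assumption: the base commonly used for RoPE


def apply_partial_rope(head_vec, position):
    """Rotate the first ROPE_DIMS entries of one head vector; pass the rest through."""
    if len(head_vec) != D_HEAD:
        raise ValueError(f"expected {D_HEAD} values, got {len(head_vec)}")
    out = list(head_vec)
    for i in range(ROPE_DIMS // 2):
        # Assumption: adjacent pairs (2i, 2i+1) share one rotation angle,
        # which grows linearly with the token position.
        theta = position * ROPE_BASE ** (-2.0 * i / ROPE_DIMS)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        a, b = head_vec[2 * i], head_vec[2 * i + 1]
        out[2 * i] = a * cos_t - b * sin_t
        out[2 * i + 1] = a * sin_t + b * cos_t
    return out


query = [1.0] * D_HEAD
rotated = apply_partial_rope(query, position=5)

# The remaining 192 dimensions carry no positional information.
assert rotated[ROPE_DIMS:] == query[ROPE_DIMS:]
```

Because each pair is only rotated, the vector's length is preserved, and position 0 leaves the vector unchanged. Applying the same function to queries and keys makes their dot product over the rotated dimensions depend on the relative distance between positions.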

Converted Models

$MODELS$

Usage

Python via llm-rs:

Installation

Via pip: pip install llm-rs

Run inference

from llm_rs import AutoModel

# Load the model; pick any file from the list above as `model_file`
model = AutoModel.from_pretrained("rustformers/gpt-j-ggml", model_file="gpt-j-6b-q4_0-ggjt.bin")

# Generate text from a prompt
print(model.generate("The meaning of life is"))

Rust via Rustformers/llm:

Installation

git clone --recurse-submodules https://github.com/rustformers/llm.git
cd llm
cargo build --release

Run inference

cargo run --release -- gptj infer -m path/to/model.bin -p "Tell me how cool the Rust programming language is:"