LLukas22 committed on
Commit 02ba2fe · 1 Parent(s): 334e16a

Update README_TEMPLATE.md

Files changed (1)
  1. README_TEMPLATE.md +28 -64
README_TEMPLATE.md CHANGED
@@ -1,77 +1,41 @@
  ---
- license: bigscience-bloom-rail-1.0
  language:
- - ak
- - ar
- - as
- - bm
- - bn
- - ca
- - code
  - en
- - es
- - eu
- - fon
- - fr
- - gu
- - hi
- - id
- - ig
- - ki
- - kn
- - lg
- - ln
- - ml
- - mr
- - ne
- - nso
- - ny
- - or
- - pa
- - pt
- - rn
- - rw
- - sn
- - st
- - sw
- - ta
- - te
- - tn
- - ts
- - tum
- - tw
- - ur
- - vi
- - wo
- - xh
- - yo
- - zh
- - zu
- programming_language:
- - C
- - C++
- - C#
- - Go
- - Java
- - JavaScript
- - Lua
- - PHP
- - Python
- - Ruby
- - Rust
- - Scala
- - TypeScript
  tags:
  - llm-rs
  - ggml
  pipeline_tag: text-generation
+ datasets:
+ - the_pile
  ---

- # GGML covnerted Models of [BigScience](https://huggingface.co/bigscience)'s Bloom models
+ # GGML converted Models of [EleutherAI](https://huggingface.co/EleutherAI)'s GPT-J model

  ## Description

- BLOOM is an autoregressive Large Language Model (LLM), trained to continue text from a prompt on vast amounts of text data using industrial-scale computational resources. As such, it is able to output coherent text in 46 languages and 13 programming languages that is hardly distinguishable from text written by humans. BLOOM can also be instructed to perform text tasks it hasn't been explicitly trained for, by casting them as text generation tasks.
+ GPT-J 6B is a transformer model trained using Ben Wang's [Mesh Transformer JAX](https://github.com/kingoflolz/mesh-transformer-jax/). "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters.
+
+ <figure>
+
+ | Hyperparameter       | Value      |
+ |----------------------|------------|
+ | \\(n_{parameters}\\) | 6053381344 |
+ | \\(n_{layers}\\)     | 28&ast;    |
+ | \\(d_{model}\\)      | 4096       |
+ | \\(d_{ff}\\)         | 16384      |
+ | \\(n_{heads}\\)      | 16         |
+ | \\(d_{head}\\)       | 256        |
+ | \\(n_{ctx}\\)        | 2048       |
+ | \\(n_{vocab}\\)      | 50257/50400&dagger; (same tokenizer as GPT-2/3) |
+ | Positional Encoding  | [Rotary Position Embedding (RoPE)](https://arxiv.org/abs/2104.09864) |
+ | RoPE Dimensions      | [64](https://github.com/kingoflolz/mesh-transformer-jax/blob/f2aa66e0925de6593dcbb70e72399b97b4130482/mesh_transformer/layers.py#L223) |
+ <figcaption><p><strong>&ast;</strong> Each layer consists of one feedforward block and one self attention block.</p>
+ <p><strong>&dagger;</strong> Although the embedding matrix has a size of 50400, only 50257 entries are used by the GPT-2 tokenizer.</p></figcaption></figure>
+
+ The model consists of 28 layers with a model dimension of 4096, and a feedforward dimension of 16384. The model
+ dimension is split into 16 heads, each with a dimension of 256. Rotary Position Embedding (RoPE) is applied to 64
+ dimensions of each head. The model is trained with a tokenization vocabulary of 50257, using the same set of BPEs as
+ GPT-2/GPT-3.


  ## Converted Models
@@ -89,7 +53,7 @@ Via pip: `pip install llm-rs`
  from llm_rs import AutoModel

  #Load the model, define any model you like from the list above as the `model_file`
- model = AutoModel.from_pretrained("rustformers/bloom-ggml",model_file="bloom-3b-q4_0-ggjt.bin")
+ model = AutoModel.from_pretrained("rustformers/gpt-j-ggml",model_file="gpt-j-6b-q4_0-ggjt.bin")

  #Generate
  print(model.generate("The meaning of life is"))
@@ -106,5 +70,5 @@ cargo build --release

  #### Run inference
  ```
- cargo run --release -- bloom infer -m path/to/model.bin -p "Tell me how cool the Rust programming language is:"
+ cargo run --release -- gptj infer -m path/to/model.bin -p "Tell me how cool the Rust programming language is:"
  ```
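The parameter count in the hyperparameter table added above can be re-derived, to within a fraction of a percent, from the listed dimensions alone. The following rough sketch (plain Python, ignoring biases, layer norms and other small terms, so it only approximates the exact 6,053,381,344) shows the arithmetic:

```python
# Rough re-derivation of n_parameters from the hyperparameters in the table above.
# Biases, layer norms and other small terms are ignored, so this is an estimate only.
n_layers = 28
d_model = 4096
d_ff = 16384
n_vocab = 50400  # size of the embedding matrix; the tokenizer only uses 50257 entries

attention = 4 * d_model * d_model      # q, k, v and output projections per layer
feed_forward = 2 * d_model * d_ff      # up- and down-projection per layer
embeddings = 2 * n_vocab * d_model     # input embedding plus output head

total = n_layers * (attention + feed_forward) + embeddings
print(f"{total:,}")  # 6,050,021,376 -- within ~0.1% of the listed 6,053,381,344
```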
 
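The table also notes that rotary position embeddings act on just 64 of the 256 dimensions in each attention head. The sketch below illustrates that idea with NumPy; it uses a generic half-split RoPE formulation, so it shows the mechanism rather than reproducing the exact Mesh Transformer JAX implementation (which may pair dimensions differently):

```python
import numpy as np

# Illustrative rotary position embedding for one attention head (d_head = 256),
# rotating only the first 64 dimensions, as listed in the table above.
D_HEAD, ROTARY_DIM = 256, 64

def apply_rope(x: np.ndarray, position: int) -> np.ndarray:
    """Rotate the first ROTARY_DIM dims of a single head vector by its position."""
    half = ROTARY_DIM // 2
    inv_freq = 1.0 / (10000 ** (np.arange(half) / half))  # per-pair rotation frequencies
    angles = position * inv_freq
    cos, sin = np.cos(angles), np.sin(angles)

    x1, x2 = x[:half], x[half:ROTARY_DIM]                 # the two halves that get rotated
    rotated = np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos])
    return np.concatenate([rotated, x[ROTARY_DIM:]])      # remaining 192 dims pass through

q = np.random.randn(D_HEAD)
print(apply_rope(q, position=5).shape)  # (256,)
```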