Commit cb265aa
Parent(s): 2eec48f
Update README.md

README.md (as updated by this commit):
This is a text generation model based on the [OPT-1.3B](https://huggingface.co/facebook/opt-1.3b) model from Meta, trained with the DeepSpeed library. Given a user input, the model generates natural, engaging conversational responses.

## Training Details

- The base model is [OPT-1.3B](https://huggingface.co/facebook/opt-1.3b), a decoder-only transformer with 1.3 billion parameters, pre-trained on a large text corpus using the causal language modeling objective.
- The model was trained on a single NVIDIA A100 GPU using DeepSpeed pipeline parallelism and the ZeRO optimizer (a minimal configuration sketch follows this list).
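
The actual DeepSpeed configuration is not included in this card. The sketch below shows one plausible ZeRO setup; the batch size, ZeRO stage, precision, and learning rate are all assumptions rather than the authors' recorded settings, and the pipeline-parallel wrapping is omitted for brevity:

```python
import deepspeed
from transformers import AutoModelForCausalLM

# Illustrative DeepSpeed configuration: every value below is an assumption.
ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "optimizer": {"type": "AdamW", "params": {"lr": 5e-5}},
    "zero_optimization": {"stage": 2},
}

model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")

# deepspeed.initialize wraps the model in a DeepSpeed engine and returns
# (engine, optimizer, training_dataloader, lr_scheduler).
engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```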

## Model Details

- Number of parameters: 1.3 billion
- Number of layers: 24
- Number of attention heads: 32
- Context size: 2048 tokens
- Vocabulary size: 50,272
- Embedding size: 2048
- Feed-forward size: 8192
- Dropout rate: 0.1
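
These values follow the published OPT-1.3B configuration and can be read directly from the base model's config:

```python
from transformers import AutoConfig

# Load the base model's configuration to check the architecture details above.
config = AutoConfig.from_pretrained("facebook/opt-1.3b")
print(config.num_hidden_layers)        # 24 layers
print(config.num_attention_heads)      # 32 attention heads
print(config.max_position_embeddings)  # 2048-token context
print(config.vocab_size)               # 50272
print(config.hidden_size)              # 2048 embedding size
print(config.ffn_dim)                  # 8192 feed-forward size
print(config.dropout)                  # 0.1
```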
## Usage
You can use this model directly with the Hugging Face pipeline for text generation.
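
A minimal sketch: the base model id below stands in for this repository's model id, which is not shown here, and the prompt format is an assumption:

```python
from transformers import pipeline

# The base model id is used as a placeholder; substitute this repository's
# model id to run the fine-tuned conversational model.
generator = pipeline("text-generation", model="facebook/opt-1.3b")

prompt = "User: What is a good book for a rainy afternoon?\nAssistant:"
outputs = generator(prompt, max_new_tokens=50, do_sample=True, top_p=0.9)
print(outputs[0]["generated_text"])
```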