Tags: Text Generation, Transformers, English, gpt_neox, red_pajama, text-generation-inference, Inference Endpoints

Original Model Link: https://huggingface.co/togethercomputer/RedPajama-INCITE-Chat-3B-v1

These GGML files will NOT work with llama.cpp as of 5/13/2023, but they DO work (as of 5/13/2023) with the GGML repository at https://github.com/ggerganov/ggml/ via its gpt-neox example. They also work in my project https://github.com/keldenl/gpt-llama.cpp, which uses ggml as an inference engine.

RedPajama-INCITE-Chat-3B-v1

RedPajama-INCITE-Chat-3B-v1 was developed by Together and leaders from the open-source AI community, including Ontocord.ai, ETH DS3Lab, AAI CERC, Université de Montréal, MILA - Québec AI Institute, the Stanford Center for Research on Foundation Models (CRFM), the Stanford Hazy Research group, and LAION.

It is fine-tuned on OASST1 and Dolly2 to enhance its chat ability.

Model Details

  • Developed by: Together Computer.
  • Model type: Language Model
  • Language(s): English
  • License: Apache 2.0
  • Model Description: A 2.8B parameter pretrained language model.

Prompt Template

To prompt the chat model, use the following format:

<human>: [Instruction]
<bot>:
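
The sketch below shows this prompt format used end to end. It is a minimal, hedged example and not part of this repository: it assumes the transformers library and the original (non-GGML) togethercomputer/RedPajama-INCITE-Chat-3B-v1 checkpoint rather than the GGML files distributed here.

# Minimal sketch, assuming the transformers library and the original
# togethercomputer/RedPajama-INCITE-Chat-3B-v1 checkpoint (not the GGML files).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "togethercomputer/RedPajama-INCITE-Chat-3B-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Build the prompt in the <human>/<bot> format shown above.
prompt = "<human>: Write a short poem about the sea.\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Print only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))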

Which model to download?

  • The q4_0 file provides lower quality but maximal compatibility; it should remain readable by past and future versions of the ggml tooling.
  • The q4_2 file offers the best combination of performance and quality. This format is still subject to change, so there may be compatibility issues.
  • The q5_0 file uses the new 5-bit quantization method released on 26th April. It is the 5-bit equivalent of q4_0.
  • The q5_1 file uses the new 5-bit quantization method released on 26th April. It is the 5-bit equivalent of q4_1.
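
To fetch one of the GGML files programmatically, a sketch like the one below can be used. It assumes the huggingface_hub library, and the filename shown is a placeholder; check this repository's file listing for the actual q4_0/q4_2/q5_0/q5_1 filenames.

# Minimal download sketch, assuming the huggingface_hub library.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="keldenl/RedPajama-INCITE-Chat-3B-v1-GGML",
    # Placeholder filename: substitute the actual quantized file name
    # from this repository's file listing.
    filename="RedPajama-INCITE-Chat-3B-v1-q4_0.bin",
)
print("Downloaded to:", path)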
