GPT-1.5 (367M Parameters)

Model Summary

GPT-1.5 is a lightweight language model with 367M parameters, designed for fast and efficient text generation. It is optimized for chatbot applications, basic text completion, and general-purpose natural language processing (NLP) tasks.

Model Details

  • Model Name: GPT-1.5
  • Parameters: 367M
  • Architecture: Modified GPT-style transformer with 50 layers and 5 attention heads.
  • Training: Fine-tuned with reinforcement learning using quality- and speed-based rewards.
  • Quantization: Available in full precision (FP32) and as a 4-bit quantized variant (see the loading sketch after this list).
  • Primary Use Case: Chatbot applications and lightweight NLP tasks.
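
Since the card lists a 4-bit quantized variant, here is a minimal loading sketch assuming the weights are compatible with on-the-fly bitsandbytes quantization in Transformers; the exact quantization scheme the authors used is not documented here, so treat the configuration below as an assumption.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumption: the checkpoint can be quantized on the fly with bitsandbytes.
# Requires a CUDA GPU and the `bitsandbytes` package.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "WolfInk/GPT-1.5",
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("WolfInk/GPT-1.5")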

Training Data

GPT-1.5 was trained on a small but high-quality dataset that includes:

  • Basic greetings and conversational responses
  • Common knowledge-based answers
  • Simple reasoning and completion tasks

Intended Use

This model is intended for:

  • Chatbot applications
  • Text generation and autocompletion
  • Basic question-answering tasks
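
For quick experiments with any of these tasks, the Transformers pipeline API is the shortest path. The snippet below is a generic sketch; the prompt and generation length are illustrative, not recommendations from the model authors.

from transformers import pipeline

# One-line setup for text generation with this checkpoint
generator = pipeline("text-generation", model="WolfInk/GPT-1.5")

# Autocompletion-style usage; the prompt is an arbitrary example
result = generator("The weather today is", max_new_tokens=20)
print(result[0]["generated_text"])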

Limitations

  • Limited reasoning ability due to its small parameter count.
  • Not suitable for complex NLP tasks requiring deep contextual understanding.
  • May generate inaccurate or biased responses depending on input prompts.

License

This model is released under an open license. Refer to the repository for details on usage and distribution rights.

How to Use

To load and run the model with Hugging Face Transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model weights and the matching tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained("WolfInk/GPT-1.5")
tokenizer = AutoTokenizer.from_pretrained("WolfInk/GPT-1.5")

# Tokenize a prompt and generate a short continuation
prompt = "Hello, how are you?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
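
Output style can be tuned with standard generate() sampling arguments; the values below are illustrative defaults for chat-style replies, not recommendations from the model authors.

# Sampling instead of greedy decoding often gives more natural chat replies
outputs = model.generate(
    **inputs,
    max_new_tokens=50,   # cap the reply length
    do_sample=True,      # draw tokens stochastically
    temperature=0.7,     # illustrative value, not an author recommendation
    top_p=0.9,           # nucleus sampling cutoff
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))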