GPT-1.5 (367M Parameters)

Model Summary

GPT-1.5 is a lightweight language model with 367M parameters, designed for fast and efficient text generation. It is optimized for chatbot applications, basic text completion, and general-purpose natural language processing (NLP) tasks.

Model Details

  • Model Name: GPT-1.5
  • Parameters: 367M
  • Architecture: Modified GPT-style transformer with 50 layers and 5 attention heads.
  • Training: Fine-tuned with reinforcement learning using quality- and speed-based rewards.
  • Quantization: Available in full precision (FP32) and as a 4-bit quantized variant (see the loading sketch after this list).
  • Primary Use Case: Chatbot applications and lightweight NLP tasks.
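
Since the card lists a 4-bit quantized variant, here is a minimal loading sketch assuming the weights are compatible with on-the-fly bitsandbytes quantization in Transformers; the exact quantization scheme the authors used is not documented here, so treat the configuration below as an assumption.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumption: the checkpoint can be quantized on the fly with bitsandbytes.
# Requires a CUDA GPU and the `bitsandbytes` package.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "WolfInk/GPT-1.5",
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("WolfInk/GPT-1.5")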

Training Data

GPT-1.5 was trained on a small but high-quality dataset that includes:

  • Basic greetings and conversational responses
  • Common knowledge-based answers
  • Simple reasoning and completion tasks

Intended Use

This model is intended for:

  • Chatbot applications
  • Text generation and autocompletion
  • Basic question-answering tasks
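
For quick experiments with any of these tasks, the Transformers pipeline API is the shortest path. The snippet below is a generic sketch; the prompt and generation length are illustrative, not recommendations from the model authors.

from transformers import pipeline

# One-line setup for text generation with this checkpoint
generator = pipeline("text-generation", model="WolfInk/GPT-1.5")

# Autocompletion-style usage; the prompt is an arbitrary example
result = generator("The weather today is", max_new_tokens=20)
print(result[0]["generated_text"])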

Limitations

  • Limited reasoning ability due to its small parameter count.
  • Not suitable for complex NLP tasks requiring deep contextual understanding.
  • May generate inaccurate or biased responses depending on input prompts.

License

This model is released under an open license. Refer to the repository for details on usage and distribution rights.

How to Use

To load and run the model with Hugging Face Transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model weights and the matching tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained("WolfInk/GPT-1.5")
tokenizer = AutoTokenizer.from_pretrained("WolfInk/GPT-1.5")

# Tokenize a prompt and generate a short continuation
prompt = "Hello, how are you?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
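
Output style can be tuned with standard generate() sampling arguments; the values below are illustrative defaults for chat-style replies, not recommendations from the model authors.

# Sampling instead of greedy decoding often gives more natural chat replies
outputs = model.generate(
    **inputs,
    max_new_tokens=50,   # cap the reply length
    do_sample=True,      # draw tokens stochastically
    temperature=0.7,     # illustrative value, not an author recommendation
    top_p=0.9,           # nucleus sampling cutoff
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))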