GPT‑2 Fine‑tuned on English Quotes

Model Description

This model is a fine‑tuned version of GPT‑2 small (124M parameters) on the Abirate/english_quotes dataset.
The goal is to generate text in the style of philosophical or literary quotes, including the author’s name.

⚠️ This model was created for educational and research purposes only. It is not intended for production use.
It demonstrates full fine‑tuning of a causal language model on a small dataset and the improvements in generation quality compared to the base model.

Base model: gpt2
Task: Causal language modelling (text generation)
Fine‑tuning type: Full fine‑tuning (all parameters updated)

Intended Uses & Limitations

Direct Use (Research / Experimentation)

You can use this model to generate short quotes from a prompt. Prompts should start with the special token <|startoftext|>; the model then generates a quote followed by an author name and the <|endoftext|> token.

Example:

from transformers import pipeline

generator = pipeline("text-generation", model="lorcannrauzduel/gpt2-citations")
output = generator("<|startoftext|> The secret to", max_new_tokens=50, do_sample=True)
print(output[0]['generated_text'])

Limitations

The model is small (124M) and was trained on only ~2,500 quotes. It may sometimes produce repetitive or nonsensical outputs.
It only generates English text.
It does not have factual knowledge about the authors; it merely mimics the style of the training quotes.
Not suitable for any commercial or critical application.

Training Details

Training Data

Dataset: Abirate/english_quotes – 2,508 quotes, each with a quote and an author field.
Preprocessing: Each example was formatted as:
```
<|startoftext|> "quote" — author <|endoftext|>
```
The special tokens help the model learn where a quote starts and ends.
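
As a rough illustration of this preprocessing step, the template above could be applied with a small helper like the one below. The function name `format_example` and the placeholder inputs are hypothetical and are not taken from the original training script; only the template itself and the `quote` and `author` field names come from this card.

```python
# Hypothetical helper, not the author's actual preprocessing code.
START_TOKEN = "<|startoftext|>"
END_TOKEN = "<|endoftext|>"


def format_example(quote: str, author: str) -> str:
    """Wrap one quote/author pair in the training template shown above."""
    return f'{START_TOKEN} "{quote.strip()}" — {author.strip()} {END_TOKEN}'


# Placeholder values, used only to show the resulting string.
print(format_example("An example quote.", "Some Author"))
```

With the `datasets` library, a helper of this kind would typically be applied to every row through `Dataset.map` before tokenization.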

Training Procedure

The model was trained for 5 epochs using the Hugging Face Trainer with the following hyperparameters:

Hyperparameter	Value
Learning rate	5e-5
Batch size (per device)	8
Gradient accumulation	2
Effective batch size	16
Warmup steps	100
Weight decay	0.01
Optimizer	AdamW
Precision	fp16
Max sequence length	128
Training steps	1410

Hardware: NVIDIA Tesla T4 (15 GB VRAM) on Google Colab / Kaggle.
Training time: ~5 minutes.
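
A minimal sketch of how the hyperparameters above might map onto `transformers.TrainingArguments`. The keyword names are real `TrainingArguments` parameters, but the `output_dir` value and the helper `build_training_args` are assumptions for illustration; the original training script is not shown in this card.

```python
# Values copied from the hyperparameter table; output_dir is a placeholder path.
training_kwargs = {
    "output_dir": "./gpt2-citations",   # assumed, not the author's path
    "num_train_epochs": 5,
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "warmup_steps": 100,
    "weight_decay": 0.01,
    "fp16": True,                        # mixed precision; requires a CUDA GPU
}

# The maximum sequence length is applied at tokenization time,
# so it is not part of TrainingArguments.
MAX_SEQ_LENGTH = 128

# Effective batch size on a single GPU: per-device batch x accumulation steps.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(effective_batch_size)  # 16


def build_training_args():
    """Build the arguments object; transformers is imported lazily here."""
    from transformers import TrainingArguments

    return TrainingArguments(**training_kwargs)
```

No optimizer is set explicitly because the Trainer uses an AdamW variant by default.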

Evaluation Results

The final training loss was 2.506, corresponding to a perplexity of 12.26.
Validation loss plateaued around 2.30, suggesting slight overfitting after 3‑4 epochs, which is acceptable for a small generative model.
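
The relation between loss and perplexity can be checked in a couple of lines, assuming the reported losses are mean per-token cross-entropy values in nats (the usual convention for the Hugging Face Trainer):

```python
import math

train_loss = 2.506  # final training loss reported above
val_loss = 2.30     # approximate validation loss reported above

# Perplexity is the exponential of the mean cross-entropy loss.
train_ppl = math.exp(train_loss)
val_ppl = math.exp(val_loss)

print(f"training perplexity:   {train_ppl:.2f}")  # 12.26
print(f"validation perplexity: {val_ppl:.2f}")    # 9.97
```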

How to Use the Model

With 🤗 Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("lorcannrauzduel/gpt2-citations")
model = AutoModelForCausalLM.from_pretrained("lorcannrauzduel/gpt2-citations")

prompt = "<|startoftext|> Life is"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=False))
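
Because the decoded text keeps the special tokens, it can help to trim the output at the first <|endoftext|> and separate the quote from the author. The helper below is a hypothetical sketch, not part of the model or of Transformers; it assumes the output follows the training template, with the same dash separator between quote and author.

```python
START_TOKEN = "<|startoftext|>"
END_TOKEN = "<|endoftext|>"
SEPARATOR = " — "  # separator used in the training template


def split_generation(text: str):
    """Return (quote, author) from decoded output; author is None if absent."""
    # Keep only what comes before the first end-of-text token.
    body = text.split(END_TOKEN, 1)[0]
    # Drop the start token and surrounding whitespace.
    body = body.replace(START_TOKEN, "", 1).strip()
    # Split on the last separator so dashes inside the quote are preserved.
    quote, sep, author = body.rpartition(SEPARATOR)
    if not sep:
        return body.strip('"'), None
    return quote.strip().strip('"'), author.strip()


# Placeholder string standing in for real model output.
sample = '<|startoftext|> "Life is short" — Some Author <|endoftext|> trailing text'
print(split_generation(sample))  # ('Life is short', 'Some Author')
```

Outputs that do not follow the template simply come back with `None` as the author, which is a convenient signal for filtering or resampling.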

With Pipeline

from transformers import pipeline

pipe = pipeline("text-generation", model="lorcannrauzduel/gpt2-citations")
print(pipe("<|startoftext|> You can never", max_new_tokens=50)[0]['generated_text'])

With vLLM (for high‑throughput inference)

pip install vllm
vllm serve "lorcannrauzduel/gpt2-citations"

Then query with curl:

curl -X POST "http://localhost:8000/v1/completions" \
     -H "Content-Type: application/json" \
     --data '{
         "model": "lorcannrauzduel/gpt2-citations",
         "prompt": "<|startoftext|> The secret to",
         "max_tokens": 50,
         "temperature": 0.8
     }'
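
The same request can be sent from Python using only the standard library. This is a sketch that assumes the vLLM server from the previous step is running on localhost:8000; the function names `build_completion_request` and `complete` are illustrative.

```python
import json
import urllib.request

MODEL_ID = "lorcannrauzduel/gpt2-citations"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=50, temperature=0.8):
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_completion_request(prompt)) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example call, left commented out because it requires the server:
# print(complete("<|startoftext|> The secret to"))
```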

With Ollama (local deployment after GGUF conversion)

Download the GGUF version from the repository (if available) or convert it yourself using llama.cpp.

Create a Modelfile:

FROM ./gpt2-citations-q4km.gguf
SYSTEM "You are a quote generator."
PARAMETER temperature 0.8
PARAMETER stop "<|endoftext|>"

Import and run:

ollama create gpt2-citations -f Modelfile
ollama run gpt2-citations "<|startoftext|> Life is"

Model Comparison (Base vs Fine‑tuned)

Prompt	GPT‑2 Base (no fine‑tuning)	GPT‑2 Fine‑tuned
`<|startoftext|> The secret to`
`<|startoftext|> Life is`
`<|startoftext|> You can never`

The fine‑tuned model consistently produces coherent quotes with an author attribution, while the base model generates irrelevant or repetitive text.

Environmental Impact

Training was performed on a cloud GPU (Tesla T4) for about 5 minutes. Estimated CO₂ emissions are negligible (< 0.01 kg CO₂eq).
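
A back-of-the-envelope check of this estimate, as a sketch under stated assumptions: the GPU is taken to draw its full 70 W board power for the whole run, and the grid carbon intensity is an assumed round figure that varies widely by region. Neither number comes from a measurement of the actual training job.

```python
GPU_POWER_WATTS = 70         # NVIDIA T4 board power; upper bound for the GPU alone
TRAINING_MINUTES = 5         # training time reported above
CARBON_INTENSITY = 0.475     # kg CO2eq per kWh; ASSUMED, region-dependent

# Energy in kWh = power (kW) x time (h).
energy_kwh = (GPU_POWER_WATTS / 1000) * (TRAINING_MINUTES / 60)
emissions_kg = energy_kwh * CARBON_INTENSITY

print(f"energy:    {energy_kwh:.4f} kWh")
print(f"emissions: {emissions_kg:.4f} kg CO2eq")
```

Under these assumptions the result stays below the 0.01 kg CO₂eq bound quoted above, even before accounting for the GPU rarely running at full power.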

Acknowledgements

The Hugging Face team for transformers and datasets.
The original GPT‑2 paper by Radford et al. (2019).
Dataset provided by Abirate.

License

This model is released under the MIT license (same as the original GPT‑2 small).

Model card created by lorcannrauzduel for research and experimentation purposes.
