Model Card for [Your Hugging Face Username]/[Your Model Repository Name]

This is a causal language model fine-tuned from [BASE_MODEL_NAME] on comments collected from the r/[TARGET_SUBREDDIT] subreddit. It's intended to generate conversational text mimicking the style and topics found in that community.

Model Details

Model Description

This model is a fine-tuned version of [BASE_MODEL_NAME]. It was trained on comments fetched from r/[TARGET_SUBREDDIT] with the PRAW library (the Python Reddit API Wrapper). The goal was to adapt the base model so that its responses reflect the conversational style of that community.
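
The data collection step described above might look roughly like the following. This is a minimal sketch, not the actual fine-tuning script: the function names `clean_comments` and `fetch_comment_bodies`, the `min_chars` threshold, and the credential placeholders are all assumptions made for illustration. PRAW is imported lazily so the cleaning helper can be tried without Reddit API credentials.

```python
from typing import Iterable, List


def clean_comments(bodies: Iterable[str], min_chars: int = 10) -> List[str]:
    """Drop placeholder and very short comments, and collapse whitespace.

    Hypothetical helper; the filtering rules are assumptions.
    """
    cleaned = []
    for body in bodies:
        text = " ".join(body.split())  # collapse newlines and runs of spaces
        if text in {"[deleted]", "[removed]"}:
            continue  # Reddit placeholders for deleted or removed comments
        if len(text) < min_chars:
            continue  # too short to be useful as training text
        cleaned.append(text)
    return cleaned


def fetch_comment_bodies(subreddit_name: str, limit: int = 100) -> List[str]:
    """Fetch recent comment bodies with PRAW (needs Reddit API credentials)."""
    import praw  # imported lazily so clean_comments works without PRAW

    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",          # placeholder
        client_secret="YOUR_CLIENT_SECRET",  # placeholder
        user_agent="comment-collector by u/your_username",  # placeholder
    )
    comments = reddit.subreddit(subreddit_name).comments(limit=limit)
    return [comment.body for comment in comments]


if __name__ == "__main__":
    sample = ["[deleted]", "short", "This is  a\nlonger comment."]
    print(clean_comments(sample))
```

With real credentials, `clean_comments(fetch_comment_bodies("[TARGET_SUBREDDIT]"))` would return a list of strings ready for tokenization.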

  • Developed by: [Your Name or Hugging Face Username] (Based on the provided fine-tuning script)
  • Funded by [optional]: [Personal Project / Self-funded / Your Funding Source]
  • Shared by [optional]: [Your Name or Hugging Face Username]
  • Model type: Causal Language Model (Decoder-only Transformer)
  • Language(s) (NLP): Primarily English (en). The dataset sourced from Reddit may contain other languages or slang specific to the community.
  • License: This model inherits the license of the original [BASE_MODEL_NAME] model: [Link to Base Model License]. The training data comes from Reddit and is subject to Reddit's User Agreement and Content Policy; users must comply with Reddit's terms when using this model or the data.
  • Finetuned from model: [BASE_MODEL_NAME] (e.g., microsoft/DialoGPT-medium or gpt2)

Model Sources [optional]

  • Repository: https://huggingface.co/[Your Hugging Face Username]/[Your Model Repository Name]
  • Paper [optional]: [Link to base model's paper, e.g., DialoGPT paper, if applicable]
  • Demo [optional]: [Link to a demo if you create one]

Uses

Direct Use

This model is intended for generating conversational text that simulates responses one might find in r/[TARGET_SUBREDDIT]. It can be used directly through the transformers text-generation pipeline, or by loading the tokenizer and model and calling model.generate() for finer control over decoding.

from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
import torch

# Using pipeline (simple)
pipe = pipeline("text-generation", model="[Your Hugging Face Username]/[Your Model Repository Name]", device=0 if torch.cuda.is_available() else -1)
prompt = "What are your thoughts on " # Example prompt
response = pipe(prompt, max_new_tokens=50, num_return_sequences=1)
print(response[0]['generated_text'])

# Manual usage (more control, similar to script's chat)
tokenizer = AutoTokenizer.from_pretrained("[Your Hugging Face Username]/[Your Model Repository Name]")
model = AutoModelForCausalLM.from_pretrained("[Your Hugging Face Username]/[Your Model Repository Name]")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

prompt = "The best thing about [topic relevant to subreddit] is "
# Call the tokenizer directly so an attention mask is returned with the input IDs
inputs = tokenizer(prompt + tokenizer.eos_token, return_tensors="pt").to(device)

# Example generation parameters (adjust as needed)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    top_k=50,
    top_p=0.92,
    temperature=0.75,
    pad_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens, skipping the prompt
response_text = tokenizer.decode(outputs[0, inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(f"Prompt: {prompt}")
print(f"Bot: {response_text}")
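
The manual example above handles a single turn. For multi-turn chat, one common convention (used by DialoGPT-style models, and assumed here rather than confirmed for this model) is to join previous turns with the EOS token and pass the result as the prompt. The helper name `build_chat_prompt` and the `max_turns` cutoff are hypothetical.

```python
from typing import List


def build_chat_prompt(turns: List[str], eos_token: str, max_turns: int = 5) -> str:
    """Join the most recent turns, each terminated by the EOS token.

    Assumes the DialoGPT-style convention of EOS-separated turns.
    """
    if max_turns < 1:
        raise ValueError("max_turns must be at least 1")
    recent = turns[-max_turns:]  # keep only the latest turns to bound prompt length
    return "".join(turn + eos_token for turn in recent)


if __name__ == "__main__":
    history = ["Hi", "Hello!", "How are you?"]
    print(build_chat_prompt(history, "<|endoftext|>", max_turns=2))
```

The returned string can be tokenized and passed to model.generate() in place of the single-turn prompt, with tokenizer.eos_token supplied as `eos_token`.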