GPT-2-Instruct_0.7B-8K-Fineweb

This repository contains a GPT-2 chatbot model fine-tuned on multiple datasets for engaging, informative conversation.

Overview

The base model is rhysjones/gpt2-774M-fineweb-150B, which was originally trained on the FineWeb dataset. This model has been further fine-tuned on multiple datasets.

The fine-tuning process, conducted over 3 epochs, has enhanced the chatbot's conversational abilities, making it suitable for various applications requiring natural language understanding and generation.

Features

  • Natural Language Understanding: The model can understand and process various conversational inputs.
  • Conversational Responses: Provides relevant and coherent responses based on the context of the conversation.
  • Fine-Tuned: Enhanced with additional training data to improve performance and adaptability.
  • English Language Only: This model works exclusively in English.
  • Extended Context Size: The model supports a context size of 8192 tokens.
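
Because the context size is 8192 tokens, the prompt and the generated tokens together have to fit inside that window. The snippet below is a minimal sketch of one way to enforce this before calling the model. The helper name `fit_to_context` and the choice to drop the oldest tokens are assumptions made for illustration, not part of this repository, and cutting from the left may split a conversation turn.

```python
def fit_to_context(token_ids, max_context=8192, max_new_tokens=100):
    """Keep the most recent tokens so that prompt + generation fits the window.

    token_ids      -- list of prompt token ids (e.g. from tokenizer.encode)
    max_context    -- context size stated in this model card (8192)
    max_new_tokens -- how many tokens you plan to generate
    """
    budget = max_context - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than max_context")
    # Negative slicing keeps the tail; short prompts are returned unchanged.
    return token_ids[-budget:]


if __name__ == "__main__":
    long_prompt = list(range(9000))      # stand-in for real token ids
    trimmed = fit_to_context(long_prompt)
    print(len(trimmed))                  # number of prompt tokens kept
```

With real inputs, the list would come from `tokenizer.encode(text)`, and the trimmed ids can be wrapped in a tensor before generation.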

Usage

Here is an example of how to use the GPT-2 chatbot in your Python code:

from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Initialize model and tokenizer
model_name_or_path = "ifmain/GPT-2-Instruct_0.7B-8K-Fineweb"
tokenizer = GPT2Tokenizer.from_pretrained(model_name_or_path)
model = GPT2LMHeadModel.from_pretrained(model_name_or_path).to(DEVICE)

# Define initial text
text = """<|user|><|content|>Hello! Who are you?<|end_turn|>
<|assistant|><|content|>Hello! I'm ChatBot, a large language model. I'm here to help you with information, answer questions, provide recommendations, and assist with a variety of tasks. How can I assist you today?<|end_turn|>
<|user|><|content|>What can you do?<|end_turn|>
<|assistant|><|content|>I can assist with a wide range of tasks, including but not limited to:

1. **Answering Questions**: Providing information on a variety of topics like science, history, technology, and more.
2. **Writing Assistance**: Helping with writing essays, articles, stories, and reports.
3. **Learning Support**: Offering explanations and tutoring on academic subjects such as math, science, and languages.
4. **Programming Help**: Assisting with coding problems, debugging, and explaining programming concepts.
5. **Language Translation**: Translating text between different languages.
6. **Recommendations**: Suggesting books, movies, recipes, and travel destinations.
7. **Conversational Practice**: Engaging in conversations to help improve language and communication skills.
8. **Creative Tasks**: Generating poems, stories, and creative ideas.
9. **Summarizing Text**: Condensing long texts into shorter summaries.
10. **Analyzing Data**: Interpreting data and creating visualizations (like charts and graphs).
11. **Real-time Information**: Providing updates on current events, weather, sports scores, etc.

Is there something specific you'd like help with?<|end_turn|>
<|user|><|content|>write why AI will not replace programmers<|end_turn|>
<|assistant|><|content|>"""

# Tokenize the text
input_ids = tokenizer.encode(text, return_tensors="pt").to(DEVICE)

# Generate new text
out = model.generate(input_ids,
                     do_sample=True,
                     temperature=1.3,
                     top_k=20,
                     top_p=0.8,
                     max_new_tokens=100,  # same limit as max_length = prompt length + 100
                     pad_token_id=tokenizer.eos_token_id
                    )

# Decode only the newly generated tokens (everything after the prompt)
generated_text = tokenizer.decode(out[0][input_ids.size(1):], skip_special_tokens=True)

# Print the generated text
print(generated_text)

Out:

AI, while being a powerful language model, is not intended to replace traditional programmers. Traditional programmers have many important skills and knowledge that AI cannot replicate. While AI can provide assistance in various areas, it will not provide the same level of expertise or creativity that a human programmer can bring.

For example, while AI can generate creative ideas and ideas, it does not have the same understanding of the concepts being discussed or the specific technical details needed to create them. Traditional programmers, on the other hand, have extensive knowledge of their chosen field, the software they work with, and the underlying concepts.

In summary, while AI can assist with tasks such as answering questions, generating content, and generating ideas, it does not possess the same level of understanding or creativity that humans can provide. In addition, AI may not provide the same level of efficiency or effectiveness as human programmers, as they often have extensive knowledge and experience.<|end_turn|>
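
The prompt in the usage example follows a fixed turn template: each turn is `<|role|><|content|>text<|end_turn|>`, turns are separated by a newline, and the prompt ends with an open assistant turn. The sketch below builds such a prompt from a list of messages and cuts a decoded reply at the first `<|end_turn|>` tag. The helper names `build_prompt` and `extract_reply` are hypothetical, and the template is inferred only from the example above, so treat it as an assumption rather than a documented format.

```python
END_TURN = "<|end_turn|>"


def build_prompt(messages):
    """Build a prompt from (role, content) tuples; role is 'user' or 'assistant'."""
    turns = []
    for role, content in messages:
        if role not in ("user", "assistant"):
            raise ValueError(f"unknown role: {role!r}")
        turns.append(f"<|{role}|><|content|>{content}{END_TURN}")
    # Leave the final assistant turn open so the model writes the reply.
    turns.append("<|assistant|><|content|>")
    return "\n".join(turns)


def extract_reply(generated_text, prompt):
    """Return the assistant reply: drop the prompt prefix, stop at the end-of-turn tag."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.split(END_TURN, 1)[0].strip()


if __name__ == "__main__":
    prompt = build_prompt([
        ("user", "Hello! Who are you?"),
        ("assistant", "Hello! I'm ChatBot, a large language model."),
        ("user", "What can you do?"),
    ])
    print(prompt)
```

The resulting string can be passed to `tokenizer.encode` in place of the hand-written `text` variable. Stopping at the first `<|end_turn|>` also discards any extra turns the model may go on to write after its own reply.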

License

This project inherits the base model license (rhysjones/gpt2-774M-fineweb-150B). Check the license of the base model before using it in your projects.

Acknowledgements

Feel free to reach out for any questions or contributions. Enjoy chatting!

Model size: 783M params (Safetensors, tensor type F32)