OpenAI ChatGPT-2

Model description

Generative Pre-trained Transformer 2 (GPT-2), developed by OpenAI, is the second model in the company's foundational series of GPT models. It was trained on a dataset of 8 million web pages. GPT-2 was announced in February 2019, and the full 1.5-billion-parameter model was released on November 5, 2019.

GPT-2 is a direct scale-up of its predecessor, GPT-1, with roughly ten times the parameter count and ten times the training data. It is a general-purpose learner: its ability across diverse tasks comes from the single skill of accurately predicting the next item in a sequence. That ability lets it translate text, answer questions about a passage, summarize long documents, and generate text that can rival human writing, although it can become repetitive or incoherent, particularly when generating long passages.

Architecturally, GPT-2 resembles its predecessor GPT-1 and its successors GPT-3 and GPT-4: it is a generative pre-trained transformer, a deep neural network built on the transformer architecture. In place of older recurrence- and convolution-based designs, the transformer relies on attention mechanisms, which let the model focus selectively on the segments of input text it predicts to be most relevant. This design allows far greater parallelization than RNN-, CNN-, or LSTM-based models and markedly surpassed their benchmark results.
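At the heart of that design is scaled dot-product attention with a causal mask. The sketch below is a minimal illustration of the mechanism, not GPT-2's actual implementation; the function and variable names are chosen for clarity only.

import torch
import torch.nn.functional as F

def causal_self_attention(q, k, v):
    # q, k, v: (batch, seq_len, head_dim) projections of the input tokens.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    # GPT-2 is autoregressive: each position may attend only to itself and
    # earlier positions, so mask out the upper triangle of the score matrix.
    seq_len = q.size(-2)
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # how strongly each position attends to the others
    return weights @ v                   # weighted mixture of the value vectors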

Training

The transformer architecture allows GPT models to be trained on far larger datasets than previous NLP (natural language processing) models. GPT-1 demonstrated the validity of the approach; GPT-2 set out to investigate the emergent properties of networks trained on extremely large datasets. Common Crawl, a large corpus previously used to train NLP systems, was considered because of its size, but closer examination revealed that much of its content was unintelligible. OpenAI therefore built a new dataset, WebText. Rather than scraping content indiscriminately from the World Wide Web, WebText collected content only from pages linked to by Reddit posts that had received at least three upvotes prior to December 2017. The dataset was then cleaned: HTML documents were parsed into plain text, duplicate pages were removed, and Wikipedia pages were excluded because their prevalence in many other datasets posed a risk of overfitting.

This model was then retrained by Anezatra on the OpenWebText corpus, an open reproduction of WebText. Following the DistilGPT2 approach, the retraining aimed to produce a lighter, more efficient version of the model: knowledge distillation preserves most of the model's capabilities while reducing the number of parameters, which speeds up both training and inference and uses resources more efficiently.
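As a rough illustration of the distillation idea (this is not Anezatra's actual training code, and all names below are illustrative), the student model can be trained against a blend of the ordinary next-token loss and a temperature-softened KL divergence toward the teacher's output distribution:

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: push the student's distribution toward the teacher's,
    # softened by temperature T to expose more of the teacher's probability mass.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitudes stay comparable across temperatures
    # Hard targets: the usual next-token cross-entropy on the real labels.
    hard = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)),
                           labels.view(-1))
    return alpha * soft + (1 - alpha) * hard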

How to use


# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate
# pip install torch

from transformers import pipeline

# Load the model and its tokenizer from the Hugging Face Hub.
text_generator = pipeline("text-generation", model="anezatra/chat-gpt2", tokenizer="anezatra/chat-gpt2")

# The model expects prompts in a "question: ...\nanswer:" format.
prompt = "question: About psychologists?\nanswer:"

# max_length counts prompt tokens plus generated tokens.
generated_text = text_generator(prompt, max_length=1000, num_return_sequences=1)

print(generated_text[0]["generated_text"])
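For finer control over decoding, the same pipeline can be unrolled into explicit tokenizer and model calls; this is the standard transformers pattern rather than anything specific to this model, and the sampling parameters below are illustrative.

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("anezatra/chat-gpt2")
model = AutoModelForCausalLM.from_pretrained("anezatra/chat-gpt2")

inputs = tokenizer("question: About psychologists?\nanswer:", return_tensors="pt")
# Unlike max_length, max_new_tokens counts only the newly generated tokens.
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))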

Example Output

We can list what I have to say about psychologists as follows:

1) There is no direct correlation between age and behavior that goes beyond a single issue or point. This can make the difference that if you have a good therapist in there to help you develop a functioning and functioning mental health system, chances of going through these issues are very low.
2) No one can make this question unanswerable.
3) This is not the case.
4) People are asked "Which psychiatrist was best for ADHD?" and "Which way did your patient get it?" What advice for them? What advice they give you about psychotherapy therapy? How do they give you therapy? Which therapy you are going to get? And what advice do they give you?
5) The answer is "Yes." In fact, people will ask more than just "who was best for ADHD," the answer is "who did the best for ADHD." People respond almost as likely as other professionals who are more likely. The question to be asked "Is that a good way to help you better?" "Is it a good way to help you improve mental health in a non-psychiatric setting?" And what advice do clinicians give you about psychotherapy therapy?
6) Some therapists are skeptical. And as many as one third of people will tell you, "I have to tell you whether there's a medical professional you can help with when you look in the mirror" about all of these questions. And it's important to note that all of these individuals answer "yes" as many times as possible. There is really no way to test the reliability of these questions with accurate information or even have a clear objective answer that will answer all of these questions.
7) Some therapists are in denial about their own mental health problems. One of the reasons I am so critical of professional psychotherapy is to identify them as people who are going through a variety of mental health issues with different mental health problems. These people are often struggling with addiction and are sometimes in denial about what they have done and the way they have done and what they do. The same cannot be said about mental illness.
8) There is something wrong with talking about the individual for years.
9) If you say, "It is my responsibility to tell you. Do I want it as much as I can?" You may sound off on some of them, but do you know what can be done? Here are some helpful things:
1. The answer is "Don't talk to other people.

Authors

Anezatra

Model size: 81.9M params
Tensor type: F32 (Safetensors)

Dataset used to train anezatra/chat-gpt2

OpenWebText