crumb/distilpythia


crumb committed on May 4, 2023

Commit eb0671f · 1 Parent(s): c023524

Update README.md

Files changed (1)

README.md +0 -2


@@ -9,8 +9,6 @@ language:
 *by GPT-4 & Crumb*
-***Note***: *this model is in the process of being re-evaluated because it was retrained.*
 ### Introduction
 Transformer models have become a popular choice for natural language processing (NLP) tasks due to their ability to handle long-range dependencies and their strong performance on various NLP benchmarks. The transformer model architecture was introduced in 2017 by [Vaswani et al.](https://arxiv.org/abs/1706.03762) and has since been used in many state-of-the-art models such as BERT and GPT. The decoder-only transformer model is a variant of the transformer model that is commonly used for generative tasks in NLP. It uses masked self-attention to predict the next token in a sequence and has been shown to be effective at predicting sequences of text.
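
To make the masked self-attention idea concrete, here is a minimal single-head sketch in plain Python. It is an illustration only, not the implementation used by this model: the function name `causal_self_attention` and the toy inputs are made up, and the inputs are used directly as queries, keys, and values, whereas a real decoder-only transformer applies learned projections and multiple heads.

```python
import math


def causal_self_attention(q, k, v):
    """Single-head masked self-attention over lists of T vectors.

    Hypothetical helper for illustration; returns (outputs, weights).
    """
    d = len(q[0])
    outputs, weights = [], []
    for t in range(len(q)):
        # Causal mask: position t may only attend to positions 0..t.
        scores = [
            sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible positions.
        m = max(scores)
        exps = [math.exp(x - m) for x in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        # Future positions receive exactly zero attention weight.
        weights.append(row + [0.0] * (len(q) - t - 1))
        # Output is the attention-weighted sum of the visible value vectors.
        outputs.append([
            sum(w * v[s][j] for s, w in enumerate(row))
            for j in range(len(v[0]))
        ])
    return outputs, weights


# Toy "embeddings" for a 3-token sequence (values are arbitrary).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = causal_self_attention(x, x, x)
```

Because each position only sees itself and earlier positions, the output at position `t` can be used to predict token `t + 1` without leaking information from the future, which is what allows all positions of a sequence to be trained in parallel.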
