crumb commited on
Commit
eb0671f
1 Parent(s): c023524

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +0 -2
README.md CHANGED
@@ -9,8 +9,6 @@ language:
9
 
10
  *by GPT-4 & Crumb*
11
 
12
- ***Note***: *this model is in the process of being re-evaluated because it was retrained.*
13
-
14
  ### Introduction
15
 
16
  Transformer models have become a popular choice for natural language processing (NLP) tasks due to their ability to handle long-range dependencies and their superior performance on various NLP benchmarks. The transformer model architecture was introduced in 2017 by [Vaswani et al](https://arxiv.org/abs/1706.03762). and has since been used in many state-of-the-art models such as BERT and GPT. The decoder-only transformer model is a variant of the transformer model that has is commonly used for generative tasks in NLP. It uses masked self-attention to predict the next token in a sequence and has been shown to be powerful at predicting sequences of text.
 
9
 
10
  *by GPT-4 & Crumb*
11
 
 
 
12
  ### Introduction
13
 
14
  Transformer models have become a popular choice for natural language processing (NLP) tasks due to their ability to handle long-range dependencies and their superior performance on various NLP benchmarks. The transformer model architecture was introduced in 2017 by [Vaswani et al](https://arxiv.org/abs/1706.03762). and has since been used in many state-of-the-art models such as BERT and GPT. The decoder-only transformer model is a variant of the transformer model that has is commonly used for generative tasks in NLP. It uses masked self-attention to predict the next token in a sequence and has been shown to be powerful at predicting sequences of text.