theblackcat102 committed on
Commit
9874e5b
1 Parent(s): da9b195

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -2,7 +2,7 @@
 license: mit
 ---
 
-A asymmetric version of [bigscience's mt0-xl model](https://huggingface.co/bigscience/mt0-xl), which trim down like [Meena chatbot](https://arxiv.org/pdf/2001.09977.pdf) from Google. An interesting aspect of Meena is that they use a small encoder, big decoder architecture.
+Asymmetric version of [bigscience's mt0-xl model](https://huggingface.co/bigscience/mt0-xl), which trim down like [Meena chatbot](https://arxiv.org/pdf/2001.09977.pdf) from Google. An interesting aspect of Meena is that they use a small encoder, big decoder architecture.
 
 > Meena has a single Evolved Transformer encoder block and 13 Evolved Transformer decoder blocks, as illustrated below. The encoder is responsible for processing the conversation context to help Meena understand what has already been said in the conversation. The decoder then uses that information to formulate an actual response. Through tuning the hyper-parameters, we discovered that a more powerful decoder was the key to higher conversational quality.
 