Loubna ben allal committed
Commit d97087d • 1 Parent(s): 210cf5b

add architecture info
architectures/codeparrot.txt
ADDED
@@ -0,0 +1,5 @@
+[CodeParrot](https://huggingface.co/lvwerra/codeparrot) uses the GPT-2 architecture with a BPE tokenizer trained on Python code.
+
+| Model | # parameters |
+| - | - |
+| GPT-2 | 1.5B |
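
As a usage sketch (not part of the file above), the CodeParrot checkpoint can be loaded with the `transformers` library for left-to-right code generation; the prompt and generation settings below are illustrative choices, not the demo's actual configuration:

```python
# Minimal sketch: generate Python code with CodeParrot (GPT-2 architecture, BPE tokenizer).
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("lvwerra/codeparrot")
model = AutoModelForCausalLM.from_pretrained("lvwerra/codeparrot")

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
# Sample a completion; max_new_tokens and temperature are illustrative values.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```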
architectures/incoder.txt
ADDED
@@ -0,0 +1,5 @@
+[InCoder](https://huggingface.co/facebook/incoder-6B) uses a decoder-only Transformer trained with a [causal masking objective](https://arxiv.org/abs/2201.07520), so the left-to-right language model can also fill in masked spans of tokens.
+
+| Model | # parameters |
+| - | - |
+| Decoder | 6.7B |
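
A similar sketch for InCoder, assuming the `facebook/incoder-6B` checkpoint is loaded the same way; infilling itself relies on sentinel tokens (e.g. `<|mask:0|>`) whose exact prompt format is documented on the model card and not reproduced here:

```python
# Minimal sketch: left-to-right generation with InCoder. The model can also infill
# masked spans via sentinel tokens such as "<|mask:0|>"; see the model card for the
# exact infilling format, which this sketch does not spell out.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# The 6.7B checkpoint is large; loading in half precision is an illustrative choice.
tokenizer = AutoTokenizer.from_pretrained("facebook/incoder-6B")
model = AutoModelForCausalLM.from_pretrained("facebook/incoder-6B", torch_dtype=torch.float16)

prompt = "def count_lines(path):\n    "
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```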
architectures/opt.txt
ADDED
@@ -0,0 +1,5 @@
+[OPT](https://huggingface.co/facebook/opt-30b) is a decoder-only model like GPT-3. It was trained on datasets containing only a small portion of code. In this demo we use the 30B-parameter model; the largest OPT model has 175B parameters.
+
+| Model | # parameters |
+| - | - |
+| Decoder | 30B |
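
For OPT, the sketch below assumes half precision and `device_map="auto"` (via the `accelerate` library) purely to fit the 30B-parameter checkpoint in memory; this is not the demo's actual serving setup:

```python
# Minimal sketch: prompting OPT-30B for code. A 30B-parameter checkpoint needs multiple
# GPUs or offloading; half precision and device_map="auto" are illustrative choices.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-30b")
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-30b", torch_dtype=torch.float16, device_map="auto"
)

prompt = "# Python function that checks if a number is prime\ndef is_prime(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```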