Commit aef9bec by nthngdy
1 Parent(s): 0c49c1f

Create README.md

Files changed (1)
  1. README.md +41 -0
README.md ADDED
@@ -0,0 +1,41 @@
+ ---
+ license: mit
+ datasets:
+ - the_pile_openwebtext2
+ language:
+ - en
+ pipeline_tag: token-classification
+ ---
+
+ ### Model Sources
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** https://github.com/NathanGodey/headless-lm
+ - **Paper:** https://arxiv.org/abs/2309.08351
+
+
+ ### Model Architecture and Objective
+
+ This model uses the Pythia-70m architecture. It was trained on OpenWebText2 with the Contrastive Weight Tying objective, then briefly fine-tuned for language generation on the same dataset.
+
+ ## Citation
+
+ **BibTeX:**
+
+ ```bibtex
+ @misc{godey2023headless,
+   title={Headless Language Models: Learning without Predicting with Contrastive Weight Tying},
+   author={Nathan Godey and Éric de la Clergerie and Benoît Sagot},
+   year={2023},
+   eprint={2309.08351},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
+
+
+ ## Contact
+
+ nathan.godey@inria.fr
+