---
language: en
tags:
- roberta-base
- roberta-base-epoch_15
license: mit
datasets:
- wikipedia
- bookcorpus
---


# RoBERTa, Intermediate Checkpoint - Epoch 15

This model is part of our reimplementation of the [RoBERTa model](https://arxiv.org/abs/1907.11692),
trained on Wikipedia and the Book Corpus only.
We trained this model for almost 100K steps, corresponding to 83 epochs.
We provide all 84 checkpoints (including the randomly initialized weights before training)
to enable studying the training dynamics of such models, among other possible use cases.

These models were trained as part of a work that studies how simple statistics of the data,
such as co-occurrences, affect model predictions, as described in the paper
[Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions](https://arxiv.org/abs/2207.14251).

This is RoBERTa-base epoch_15.

## Model Description

This model was captured during a reproduction of
[RoBERTa-base](https://huggingface.co/roberta-base), for English: it
is a Transformers model pretrained on a large corpus of English data, using the
Masked Language Modelling (MLM) objective.

The intended uses, limitations, training data and training procedure for the fully trained model are similar
to [RoBERTa-base](https://huggingface.co/roberta-base). Two major
differences from the original model:

* We trained our model for 100K steps, instead of 500K
* We only use Wikipedia and the Book Corpus, as these corpora are publicly available.

### How to use

Using code from
[RoBERTa-base](https://huggingface.co/roberta-base), here is an example based on
PyTorch:

```python
from transformers import pipeline

model = pipeline("fill-mask", model='yanaiela/roberta-base-epoch_15', device=1, top_k=10)
model("Hello, I'm the <mask> RoBERTa-base language model")
```

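If you prefer not to use the `pipeline` API, the same checkpoint can be queried directly with a tokenizer and an MLM head. This is a minimal sketch, assuming the repository id `yanaiela/roberta-base-epoch_15` and a standard RoBERTa fill-mask setup; the top-k indexing logic is ours, not part of the released code:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumed repository id; swap in another epoch to probe a different checkpoint.
repo = 'yanaiela/roberta-base-epoch_15'
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForMaskedLM.from_pretrained(repo)
model.eval()

text = "Hello, I'm the <mask> RoBERTa-base language model"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and rank the 10 most likely replacement tokens.
mask_index = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
top_ids = logits[0, mask_index].topk(10).indices
print([tokenizer.decode(i).strip() for i in top_ids])
```
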
## Citation info

```bibtex
@article{2207.14251,
  Author = {Yanai Elazar and Nora Kassner and Shauli Ravfogel and Amir Feder and Abhilasha Ravichander and Marius Mosbach and Yonatan Belinkov and Hinrich Schütze and Yoav Goldberg},
  Title = {Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions},
  Year = {2022},
  Eprint = {arXiv:2207.14251},
}
```