wissamantoun committed
Commit f216932
1 Parent(s): 5e06ef0

Upload folder using huggingface_hub

Files changed (4)
  1. Readme.md +101 -0
  2. config.json +32 -0
  3. pytorch_model.bin +3 -0
  4. tokenizer.json +0 -0
Readme.md ADDED
@@ -0,0 +1,101 @@
+ ---
+ license: mit
+ datasets:
+ - ccnet-fr
+ language:
+ - fr
+ tags:
+ - pagnol
+ ---
+
+ # PAGnol: An Extra-Large French Generative Model
+
+ Paper: [ARXIV](https://arxiv.org/abs/2110.08554), [ACL ANTHOLOGY](https://aclanthology.org/2022.lrec-1.455/)
+
+ Code: [GITHUB](https://github.com/lightonai/lairgpt)
+
+ PAGnol is a collection of large French language models, geared towards free-form text generation, with up to 1.5 billion parameters. PAGnol is based on the [GPT](https://arxiv.org/abs/2005.14165) architecture. It is the first language model trained by [LightOn](https://lighton.ai/), in cooperation with the [ALMAnaCH team of Inria](http://almanach.inria.fr/index-en.html).
+
+ These models were trained in early 2021, following the [scaling laws](https://arxiv.org/abs/2001.08361) of the time and using the exact same training data as the [CamemBERT](https://camembert-model.fr/) model, drawn from [CCNet](https://github.com/facebookresearch/cc_net). We make them available for reproducibility and transparency purposes.
+ They do not constitute the current state of the art, nor do they aim to.
+
+ PAGnol was built by [Julien Launay](https://lolo.science/), E.L. Tommasone, [Baptiste Pannier](https://www.linkedin.com/in/baptiste-pannier-b30758154/), [François Boniface](https://www.linkedin.com/in/fran%c3%a7ois-boniface-26313610b/), [Amélie Chatelain](https://www.instagram.com/amelietabatta/), [Iacopo Poli](https://twitter.com/iacopo_poli), and [Djamé Seddah](http://pauillac.inria.fr/~seddah/). It is named after Marcel Pagnol (with PAG standing for "pré-apprentissage génératif", generative pre-training), and was trained on the IDRIS Jean Zay supercomputer thanks to a GENCI allocation.
+
+ The model was converted to the Hugging Face format by [Wissam Antoun](https://wissamantoun.com) (PhD student at [ALMAnaCH](http://almanach.inria.fr/index-en.html), co-supervised by [Benoît Sagot](https://pauillac.inria.fr/~sagot/) and [Djamé Seddah](http://pauillac.inria.fr/~seddah/)).
+
+ # Usage
+
+ ### Using PAGnol with Hugging Face
+ ```python
+ from transformers import pipeline
+
+ generator = pipeline('text-generation', model='lightonai/pagnol-small', trust_remote_code=True)
+
+ output = generator(
+     "Salut PAGnol, comment ça va ?",
+     max_length=50,
+     do_sample=True,
+     temperature=0.7,
+ )[0]["generated_text"]
+
+ >>> "Très bien! Les jours d’été sont là ! Bientôt les premiers festivals..."
+ ```
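+
+ The `pipeline` call above handles tokenization and decoding for you. For finer control over generation, the checkpoint can also be loaded directly; the snippet below is a minimal sketch, assuming the custom code shipped with the checkpoint is compatible with the standard `AutoTokenizer` / `AutoModelForCausalLM` classes (not verified here):
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Assumption: the remote code registers auto-classes usable through the standard API.
+ tokenizer = AutoTokenizer.from_pretrained("lightonai/pagnol-small", trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained("lightonai/pagnol-small", trust_remote_code=True)
+
+ inputs = tokenizer("Salut PAGnol, comment ça va ?", return_tensors="pt")
+ outputs = model.generate(**inputs, max_length=50, do_sample=True, temperature=0.7)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```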
+
+ ### Using PAGnol with lairgpt
+ Head over to our [GitHub repository](https://github.com/lightonai/lairgpt) to access our PyTorch inference code.
+ Using PAGnol is as simple as running the following code:
+ ```python
+ from lairgpt.models import PAGnol
+
+ pagnol = PAGnol.small()
+ pagnol("Salut PAGnol, comment ça va ?")
+
+ >>> "Très bien! Les jours d’été sont là ! Bientôt les premiers festivals..."
+ ```
+
+ # License
+ PAGnol is made available under the MIT license: by downloading the models listed below, you agree to the terms of the MIT license agreement. Under no circumstances will LightOn and/or Inria be held responsible or liable in any way for any claims, damages, losses, expenses, costs or liabilities whatsoever (including, without limitation, any direct or indirect damages for loss of profits, business interruption or loss of information) resulting or arising directly or indirectly from your use of or inability to use PAGnol.
+
+ # Available Models
+ - [`lightonai/pagnol-small`](https://huggingface.co/lightonai/pagnol-small): 125M parameters
+ - [`lightonai/pagnol-medium`](https://huggingface.co/lightonai/pagnol-medium): 355M parameters
+ - [`lightonai/pagnol-large`](https://huggingface.co/lightonai/pagnol-large): 773M parameters
+ - [`lightonai/pagnol-xl`](https://huggingface.co/lightonai/pagnol-xl): 1.5B parameters
+
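+ The advertised sizes can be checked directly against the loaded weights; a quick sketch, assuming the checkpoints load through `transformers` as in the usage example above:
+ ```python
+ from transformers import AutoModelForCausalLM
+
+ # Count the parameters of a given checkpoint (this downloads the weights).
+ model = AutoModelForCausalLM.from_pretrained("lightonai/pagnol-small", trust_remote_code=True)
+ n_params = sum(p.numel() for p in model.parameters())
+ print(f"{n_params / 1e6:.0f}M parameters")  # expected: roughly 125M for pagnol-small
+ ```
+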
+ # Citation
+ ```
+ @inproceedings{launay-etal-2022-pagnol,
+     title = "{PAG}nol: An Extra-Large {F}rench Generative Model",
+     author = "Launay, Julien and
+       Tommasone, E.l. and
+       Pannier, Baptiste and
+       Boniface, Fran{\c{c}}ois and
+       Chatelain, Am{\'e}lie and
+       Cappelli, Alessandro and
+       Poli, Iacopo and
+       Seddah, Djam{\'e}",
+     editor = "Calzolari, Nicoletta and
+       B{\'e}chet, Fr{\'e}d{\'e}ric and
+       Blache, Philippe and
+       Choukri, Khalid and
+       Cieri, Christopher and
+       Declerck, Thierry and
+       Goggi, Sara and
+       Isahara, Hitoshi and
+       Maegaard, Bente and
+       Mariani, Joseph and
+       Mazo, H{\'e}l{\`e}ne and
+       Odijk, Jan and
+       Piperidis, Stelios",
+     booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
+     month = jun,
+     year = "2022",
+     address = "Marseille, France",
+     publisher = "European Language Resources Association",
+     url = "https://aclanthology.org/2022.lrec-1.455",
+     pages = "4275--4284",
+ }
+ ```
+
+ # Contact
+ For research enquiries: pagnol@lighton.ai
+ For business enquiries: customer.relations@lighton.ai
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "activation_function": "gelu",
+   "add_prefix_space": true,
+   "architectures": [
+     "GPT2Model"
+   ],
+   "attn_pdrop": 0.1,
+   "bos_token_id": 1,
+   "embd_pdrop": 0.1,
+   "eos_token_id": 1,
+   "initializer_range": 0.02,
+   "layer_norm_epsilon": 1e-05,
+   "model_type": "gpt2",
+   "n_embd": 768,
+   "n_head": 12,
+   "n_inner": null,
+   "n_layer": 12,
+   "n_positions": 1024,
+   "reorder_and_upcast_attn": false,
+   "resid_pdrop": 0.1,
+   "scale_attn_by_inverse_layer_idx": false,
+   "scale_attn_weights": true,
+   "summary_activation": null,
+   "summary_first_dropout": 0.1,
+   "summary_proj_to_labels": true,
+   "summary_type": "cls_index",
+   "summary_use_proj": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.26.0",
+   "use_cache": true,
+   "vocab_size": 50262
+ }
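
For reference, a rough parameter count derived from this config (a sketch using the standard GPT-2 parameter formula; exact totals depend on weight tying and stored buffers) lands near the 125M figure quoted for `pagnol-small` in the README, consistent with the float32 weight file below:

```python
# Approximate GPT-2 parameter count from the config values above.
n_embd, n_layer, n_positions, vocab_size = 768, 12, 1024, 50262

embeddings = vocab_size * n_embd + n_positions * n_embd   # token + position embeddings
per_block = 12 * n_embd**2 + 13 * n_embd                  # attention + MLP weights and biases + layer norms
total = embeddings + n_layer * per_block + 2 * n_embd      # plus final layer norm

print(f"~{total / 1e6:.0f}M parameters")  # ~124M, in line with pagnol-small (125M)
```
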
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:522744484cd222a0a30c2c2b383c5d64efe7b2b76522327d8e6341d220c8c45e
+ size 510407273
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff