jarodrigues committed
Commit 9ab78df · verified · 1 Parent(s): 062069e

Update README.md

Files changed (1)
  1. README.md +6 -5
README.md CHANGED
@@ -22,6 +22,7 @@ tags:
 - foundation model
 datasets:
 - PORTULAN/glue-ptpt
+- PORTULAN/extraglue
 ---
 </br>
 </br>
@@ -82,7 +83,7 @@ Gervásio-7B-PTPT-Decoder is distributed under an [MIT license](https://huggingf

 # Training Data

-**Gervásio 7B PT-PT** over standard supervised fine-tuning, and to keep some alignment with mainstream benchmarks for English, we resorted to tasks and respective datasets in the GLUE and the SuperGLUE collections.
+**Gervásio 7B PT-PT** was trained over standard supervised fine-tuning, and to keep some alignment with mainstream benchmarks for English, we resorted to tasks and respective datasets in the GLUE and the SuperGLUE collections.


 We selected those datasets where the outcome of their machine translation into Portuguese could preserve, in the target language, the linguistic properties at stake.
@@ -102,11 +103,11 @@ And from SuperGLUE, we included these other four tasks:

 Instruction templates have been manually crafted for each task.
 These take the various fields in the dataset and arrange them into a prompt.
-These templates are listed in full detail in TODO.
+These templates are listed in full detail in the [Extraglue dataset](https://huggingface.co/datasets/PORTULAN/extraglue).

 # Training Details

-We applied supervised fine-tuning with causal language modeling (CLM) training objective with a zero-out technique during the fine-tuning process.
+We applied supervised fine-tuning with a causal language modeling (CLM) training objective following a zero-out technique during the fine-tuning process.
 Specifically, while the entire prompt received attention during fine-tuning, only the response tokens were subjected to back-propagation.

 In terms of hyper-parameters, both models were trained with a learning rate of 2 * 10^-5, a weight decay of 0.1, a two-epoch training regime without warm-up, and to ensure the same number of tokens back-propagated per step, we employed an input sequence of 512 tokens with a batch size of 16 and 16 accumulation steps.
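As a reading aid for the hunk above: the templates themselves are documented in the linked ExtraGLUE dataset, but a purely hypothetical sketch of the kind of template the paragraph describes could look like the snippet below. The field names, wording, and the `rte_prompt` helper are invented here for illustration, not taken from the dataset.

```python3
# Hypothetical illustration only: an instruction template that arranges the
# fields of an entailment-style example into a prompt. The real templates and
# field names live in the ExtraGLUE dataset referenced in the diff above.
def rte_prompt(premise: str, hypothesis: str) -> str:
    return (
        f"Premissa: {premise}\n"
        f"Hipótese: {hypothesis}\n"
        "A hipótese decorre da premissa? Responda Sim ou Não.\n"
        "Resposta:"
    )

print(rte_prompt("O concerto foi adiado.", "O concerto não se realizou na data prevista."))
```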
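The "zero-out" detail in the Training Details hunk (the whole prompt is attended to, but only response tokens are back-propagated) is typically implemented by masking the prompt positions in the labels tensor. Below is a minimal sketch under that assumption, not the authors' actual training code; `build_example` and `IGNORE_INDEX` are names invented here.

```python3
# Sketch of prompt masking ("zero-out"): the model attends over the full
# prompt+response sequence, but the loss, and hence back-propagation, only
# covers the response tokens. Positions labelled -100 are ignored by the
# causal-LM loss in transformers.
import torch

IGNORE_INDEX = -100

def build_example(tokenizer, prompt: str, response: str, max_length: int = 512):
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]
    input_ids = (prompt_ids + response_ids)[:max_length]
    labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:max_length]
    return {
        "input_ids": torch.tensor(input_ids),
        "attention_mask": torch.ones(len(input_ids), dtype=torch.long),
        "labels": torch.tensor(labels),
    }
```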
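The hyper-parameters reported in the last line of that hunk map roughly onto `transformers`' `TrainingArguments` as below. This is a sketch for orientation only, not the configuration actually used; the output directory is a placeholder.

```python3
# Rough mapping of the reported hyper-parameters onto TrainingArguments;
# a sketch for orientation only, not the authors' configuration.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gervasio-ptpt-sft",      # placeholder path
    learning_rate=2e-5,                  # 2 * 10^-5
    weight_decay=0.1,
    num_train_epochs=2,
    warmup_steps=0,                      # no warm-up
    per_device_train_batch_size=16,
    gradient_accumulation_steps=16,
    # input sequences are truncated/packed to 512 tokens at tokenization time
)
```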
@@ -139,7 +140,7 @@ You can use this model directly with a pipeline for causal language modeling (CL

 ```python3
 >>> from transformers import pipeline
->>> generator = pipeline(model='PORTULAN/gervasio-ptpt-decoder')
+>>> generator = pipeline(model='PORTULAN/gervasio-7b-portuguese-ptpt-decoder')
 >>> generator("A música portuguesa é", max_new_tokens=10)
 [{'generated_text': 'A música portuguesa é uma das mais ricas do mundo'}]

@@ -156,4 +157,4 @@ grant PINFRA/22117/2016; research project GPT-PT - Transformer-based Decoder for
 grant CPCA-IAC/AV/478395/2022; innovation project
 ACCELERAT.AI - Multilingual Intelligent Contact Centers, funded by IAPMEI, I.P. - Agência para a Competitividade e Inovação
 under the grant C625734525-00462629, of Plano de Recuperação e Resiliência,
-call RE-C05-i01.01 – Agendas/Alianças Mobilizadoras para a Reindustrialização.
+call RE-C05-i01.01 – Agendas/Alianças Mobilizadoras para a Reindustrialização.