---
license: apache-2.0
datasets:
- thegoodfellas/mc4-pt-cleaned
language:
- pt
inference: false
---

# Model Card for tgf-flan-t5-base-ptbr

This is the PT-BR Flan-T5-base model.

# Model Details

## Model Description

This model was created to serve as a base for researchers who want to learn how Flan-T5 works. This is the Portuguese version.

- **Developed by:** The Good Fellas team
- **Model type:** Flan-T5
- **Language(s) (NLP):** Portuguese (BR)
- **License:** apache-2.0
- **Finetuned from model:** Flan-T5-base

We would like to thank the TPU Research Cloud team for the amazing opportunity given to us. To learn about TRC: https://sites.research.google/trc/about/

# Uses

This model can be used as a base for downstream tasks, as described in the Flan-T5 paper.

# Bias, Risks, and Limitations

Due to the nature of the web-scraped corpus on which Flan-T5 models were trained, it is likely that their usage could reproduce and amplify pre-existing biases in the data, resulting in potentially harmful content such as racial or gender stereotypes and conspiracist views. For this reason, the study of such biases is explicitly encouraged, and model usage should ideally be restricted to research-oriented and non-user-facing endeavors.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import FlaxT5ForConditionalGeneration

model_flax = FlaxT5ForConditionalGeneration.from_pretrained("thegoodfellas/tgf-flan-t5-base-ptbr")
```

# Training Details

## Training Data

The training was performed on two datasets, BrWaC and OSCAR (Portuguese section).

## Training Procedure

We trained this model for one epoch on each dataset.

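The procedure above follows T5-style pre-training. As an assumption not stated in this card, that typically means a span-corruption objective: random spans of the input are replaced with sentinel tokens and the decoder learns to reconstruct them. A minimal, illustrative sketch of how one input/target pair is built (word-level here; real training operates on subword tokens):

```python
def span_corrupt(tokens, spans):
    """Build a T5-style span-corruption example.

    tokens: list of input tokens (words here, subwords in practice).
    spans:  sorted, non-overlapping (start, end) index pairs to mask.
            Each masked span becomes a sentinel <extra_id_N> in the
            input; the target lists each sentinel followed by the
            tokens it replaced.
    """
    inp, tgt = [], []
    prev = 0
    for n, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{n}>"
        inp.extend(tokens[prev:start])  # keep unmasked tokens
        inp.append(sentinel)            # mark the corrupted span
        tgt.append(sentinel)            # target: sentinel + original span
        tgt.extend(tokens[start:end])
        prev = end
    inp.extend(tokens[prev:])
    return " ".join(inp), " ".join(tgt)

toks = "o gato dorme no sofa da sala".split()
inp, tgt = span_corrupt(toks, [(1, 2), (4, 6)])
# inp -> "o <extra_id_0> dorme no <extra_id_1> sala"
# tgt -> "<extra_id_0> gato <extra_id_1> sofa da"
```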

### Training Hyperparameters

Thanks to the TPU Research Cloud we were able to train this model on a single TPUv2-8.

- **Training regime:**
  - Precision: bf16
  - Batch size: 32
  - LR: 0.005
  - Warmup steps: 10_000
  - Epochs: 1 (each dataset)
  - Optimizer: Adafactor

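The learning-rate schedule implied by the values above can be sketched as follows. Only the peak LR (0.005) and the warmup length (10,000 steps) come from this card; the linear warmup shape and the constant rate after warmup are illustrative assumptions, since the card does not state a decay rule:

```python
PEAK_LR = 0.005       # LR listed above
WARMUP_STEPS = 10_000 # warmup steps listed above

def learning_rate(step: int) -> float:
    """Linear warmup to PEAK_LR over WARMUP_STEPS, then constant.

    The shape (linear warmup, no decay) is an assumption for
    illustration; only the peak LR and warmup length are from the card.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR

print(learning_rate(5_000))   # halfway through warmup -> 0.0025
print(learning_rate(20_000))  # after warmup -> 0.005
```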
# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

Experiments were conducted on Google Cloud Platform in region us-central1, which has a carbon efficiency of 0.57 kgCO$_2$eq/kWh. A cumulative 50 hours of computation was performed on hardware of type TPUv2 chip (TDP of 221W).

Total emissions are estimated at 6.3 kgCO$_2$eq, of which 100 percent was directly offset by the cloud provider.

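The 6.3 kgCO$_2$eq figure follows directly from the numbers above (50 h at a 221 W TDP in a 0.57 kgCO$_2$eq/kWh region):

```python
TDP_KW = 0.221           # TPUv2 chip TDP, 221 W
HOURS = 50               # cumulative compute time
CARBON_INTENSITY = 0.57  # kgCO2eq per kWh in us-central1

energy_kwh = TDP_KW * HOURS              # 11.05 kWh
emissions = energy_kwh * CARBON_INTENSITY
print(round(emissions, 1))  # -> 6.3 kgCO2eq
```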

- **Hardware Type:** TPUv2
- **Hours used:** 50
- **Cloud Provider:** GCP
- **Compute Region:** us-central1
- **Carbon Emitted:** 6.3 kgCO$_2$eq

# Technical Specifications

## Model Architecture and Objective

Flan-T5