crscardellino committed on
Commit
b656251
2 Parent(s): 1f5c8b8 a15aee5

Merge branch 'main' of https://huggingface.co/crscardellino/flisol-cba-martin-fierro

Files changed (5)
  1. README.md +57 -17
  2. config.json +40 -0
  3. generation_config.json +6 -0
  4. pytorch_model.bin +3 -0
  5. training_args.bin +3 -0
README.md CHANGED
@@ -1,25 +1,65 @@
- Hugging Face: IA Colaborativa
- =============================
-
- This repository holds the code and model I trained for the talk
- ["Hugging Face: IA Colaborativa"](https://eventol.flisol.org.ar/events/cordoba2023/activity/378/)
- at the 2023 [FLISoL de Córdoba](https://cordoba.flisol.org.ar), Argentina.
-
- To set everything up you need [`git-lfs`](https://git-lfs.com/) installed and
- enabled.
-
- You can clone the repository with:
-
-     $ git clone https://huggingface.co/crscardellino/flisol-cba-martin-fierro
-
- Then create the environment and install the requirements:
-
-     $ python -m venv flisol-venv
-     $ source ./flisol-venv/bin/activate
-     (flisol-venv) $ pip install -r requirements.txt
-
- The code is tested with Python 3.10, but it should work with Python >= 3.8.
- The requirements are set up to install [PyTorch](https://pytorch.org/) v2.0.0
- for CPU, but you can adjust them to use GPUs, provided your setup meets the
- CUDA requirements.
+ ---
+ license: mit
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: flisol-cba-martin-fierro
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # flisol-cba-martin-fierro
+
+ This model is a fine-tuned version of [DeepESP/gpt2-spanish](https://huggingface.co/DeepESP/gpt2-spanish) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 3.9067
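Assuming the reported loss is the usual average per-token cross-entropy (in nats) that the 🤗 Trainer reports for causal language models, the corresponding evaluation perplexity can be recovered with `exp(loss)` — a quick sanity check, not a figure stated in the card:

```python
import math

# Final evaluation loss reported in the model card above.
eval_loss = 3.9067

# For a causal LM, perplexity is the exponential of the
# average per-token cross-entropy loss.
perplexity = math.exp(eval_loss)

print(f"eval perplexity ~ {perplexity:.1f}")  # ~ 49.7
```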
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 10
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | 4.3864        | 1.0   | 18   | 4.2025          |
+ | 3.948         | 2.0   | 36   | 4.0440          |
+ | 3.7962        | 3.0   | 54   | 3.9804          |
+ | 3.6105        | 4.0   | 72   | 3.9458          |
+ | 3.4444        | 5.0   | 90   | 3.9280          |
+ | 3.3855        | 6.0   | 108  | 3.9192          |
+ | 3.3142        | 7.0   | 126  | 3.9091          |
+ | 3.2192        | 8.0   | 144  | 3.9074          |
+ | 3.1615        | 9.0   | 162  | 3.9070          |
+ | 3.1637        | 10.0  | 180  | 3.9067          |
+
+ ### Framework versions
+
+ - Transformers 4.28.1
+ - Pytorch 2.0.0+cpu
+ - Datasets 2.11.0
+ - Tokenizers 0.13.3
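The hyperparameters and results table above imply a warmup-free linear decay from 2e-05 down to 0 over the 180 optimizer steps (18 steps/epoch × 10 epochs). A minimal sketch of that schedule, assuming the Trainer's default linear scheduler with zero warmup steps (the card lists none):

```python
# Sketch of the linear learning-rate schedule implied by the
# hyperparameters above (assumes no warmup, as the card lists none).
BASE_LR = 2e-05
TOTAL_STEPS = 180  # 18 steps/epoch * 10 epochs, per the results table

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates."""
    return BASE_LR * max(0.0, (TOTAL_STEPS - step) / TOTAL_STEPS)

print(linear_lr(0))    # 2e-05 at the start of training
print(linear_lr(90))   # 1e-05 halfway through (end of epoch 5)
print(linear_lr(180))  # 0.0 at the end of training
```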
config.json ADDED
@@ -0,0 +1,40 @@
+ {
+   "_name_or_path": "DeepESP/gpt2-spanish",
+   "activation_function": "gelu_new",
+   "architectures": [
+     "GPT2LMHeadModel"
+   ],
+   "attn_pdrop": 0.1,
+   "bos_token_id": 50256,
+   "embd_pdrop": 0.1,
+   "eos_token_id": 50256,
+   "gradient_checkpointing": false,
+   "initializer_range": 0.02,
+   "layer_norm_epsilon": 1e-05,
+   "model_type": "gpt2",
+   "n_ctx": 1024,
+   "n_embd": 768,
+   "n_head": 12,
+   "n_inner": null,
+   "n_layer": 12,
+   "n_positions": 1024,
+   "reorder_and_upcast_attn": false,
+   "resid_pdrop": 0.1,
+   "scale_attn_by_inverse_layer_idx": false,
+   "scale_attn_weights": true,
+   "summary_activation": null,
+   "summary_first_dropout": 0.1,
+   "summary_proj_to_labels": true,
+   "summary_type": "cls_index",
+   "summary_use_proj": true,
+   "task_specific_params": {
+     "text-generation": {
+       "do_sample": true,
+       "max_length": 50
+     }
+   },
+   "torch_dtype": "float32",
+   "transformers_version": "4.28.1",
+   "use_cache": true,
+   "vocab_size": 50257
+ }
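The dimensions in this config pin down the parameter count of the standard GPT-2 small architecture. As a sanity check (the count is not stated anywhere in the repo), the layer shapes can be summed by hand, assuming the usual GPT-2 conventions: `n_inner: null` means 4×`n_embd`, and the LM head is weight-tied to the token embeddings, so it is counted once:

```python
# Parameter count implied by the config.json above (GPT-2 small layout,
# input embeddings tied to the LM head, so counted only once).
n_embd, n_head, n_layer = 768, 12, 12
n_positions, vocab_size = 1024, 50257
n_inner = 4 * n_embd  # "n_inner": null -> GPT-2 default of 4 * n_embd

wte = vocab_size * n_embd                  # token embeddings
wpe = n_positions * n_embd                 # position embeddings
per_block = (
    2 * n_embd                             # ln_1 (scale + bias)
    + n_embd * 3 * n_embd + 3 * n_embd     # c_attn: fused Q/K/V projection
    + n_embd * n_embd + n_embd             # attention output projection
    + 2 * n_embd                           # ln_2
    + n_embd * n_inner + n_inner           # MLP up-projection
    + n_inner * n_embd + n_embd            # MLP down-projection
)
ln_f = 2 * n_embd                          # final layer norm

total = wte + wpe + n_layer * per_block + ln_f
print(total)  # 124439808 (~124M, the standard GPT-2 small size)
```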
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 50256,
+   "eos_token_id": 50256,
+   "transformers_version": "4.28.1"
+ }
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b1e06b84b625ee566a1e8dcc2c7dd064d63481f644349828709ab7757e9dfce0
+ size 510395581
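As a rough cross-check (my assumption about where the extra bytes go, not something the repo states), the pointer's `size` field lines up with a ~124M-parameter model serialized in float32 per the config's `"torch_dtype": "float32"`: the small remainder is plausibly non-trainable buffers and pickle framing in the checkpoint:

```python
# Rough consistency check between the LFS pointer size above and a
# ~124M-parameter GPT-2 stored as float32 (4 bytes per value).
checkpoint_bytes = 510_395_581   # "size" field in the pointer file
params = 124_439_808             # GPT-2 small parameter count
weight_bytes = params * 4        # float32, per "torch_dtype" in config.json

overhead = checkpoint_bytes - weight_bytes  # buffers + serialization framing
print(f"weights ~ {weight_bytes / 1e6:.0f} MB, overhead ~ {overhead / 1e6:.1f} MB")
```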
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:402f127db0f19abbb4782498dd89f615bb96f958fb37ddfd8087acf6cc097fe4
+ size 3579