lucasbiagettia committed on
Commit b09429a
1 parent: 72b8f73

End of training

README.md CHANGED
@@ -1,69 +1,51 @@
 ---
-language:
-- es
 license: apache-2.0
+base_model: lucasbiagettia/gpt2-base-borges
 tags:
-- "borges"
-- "spanish"
-- "text-generation"
-datasets:
-- "borges_works"
-widget:
-- text: "El modelo del lenguaje GPT es capaz de"
-- text: "Las obras de Borges son una fuente rica de conocimiento y creatividad"
+- generated_from_trainer
+model-index:
+- name: gpt2-base-borges
+  results: []
 ---
 
-# GPT2-borges (gpt2-borges)
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
 
+# gpt2-base-borges
 
-## Overview
-
-- **Architecture:** gpt2-base
-- **Language:** Spanish
-- **Task:** text-generation
-- **Data:** Borges Works
+This model is a fine-tuned version of [lucasbiagettia/gpt2-base-borges](https://huggingface.co/lucasbiagettia/gpt2-base-borges) on an unknown dataset.
 
 ## Model description
 
-**GPT2-borges** is a transformer-based model for the Spanish language. It is based on the PlanTL-GOB-ES/gpt2-base-bne model and has been pre-trained using a curated dataset consisting of the complete works of Jorge Luis Borges, a renowned Argentine writer.
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
 
-## Intended uses and limitations
-You can use the raw model for text generation or fine-tune it to a downstream task.
+More information needed
 
-## How to Use
-Here is how to use this model:
+## Training procedure
 
-You can use this model directly with a pipeline for text generation. Since generation relies on some randomness, we set a seed for reproducibility:
+### Training hyperparameters
 
-```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
-model_name = 'lucasbiagettia/gpt2-base-borges'
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForCausalLM.from_pretrained(model_name)
-```
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 1
 
-```python
-input_text = "La arena me recuerda el profundo dolor de la nostalgia"
+### Training results
 
-input_ids = tokenizer.encode(input_text, return_tensors="pt")
-attention_mask = torch.ones(input_ids.shape, dtype=torch.long)
-
 
-generated_text = model.generate(
-    input_ids=input_ids,
-    attention_mask=attention_mask,
-    max_new_tokens=100,
-    num_return_sequences=1,
-    no_repeat_ngram_size=6,
-    top_k=35,
-    top_p=0.95,
-    temperature=0.8,
-    pad_token_id=50256,
-    do_sample=True,
-)
-```
 
-## Training
-It was trained with the following dataset:
+### Framework versions
 
-https://github.com/lucasbiagettia/borges_plain_text_dataset
+- Transformers 4.38.2
+- Pytorch 2.2.1+cu121
+- Tokenizers 0.15.2
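
The hyperparameters reported in the new card map one-to-one onto `transformers`' `TrainingArguments`. A minimal sketch of that mapping follows; the `output_dir` name is an assumption, and model/dataset wiring is omitted:

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters reported in the card above.
# output_dir is an assumed name, not taken from the repo.
training_args = TrainingArguments(
    output_dir="gpt2-base-borges",
    learning_rate=5e-5,             # learning_rate: 5e-05
    per_device_train_batch_size=8,  # train_batch_size: 8
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    seed=42,                        # seed: 42
    adam_beta1=0.9,                 # optimizer: Adam with betas=(0.9,0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,              # and epsilon=1e-08
    lr_scheduler_type="linear",     # lr_scheduler_type: linear
    num_train_epochs=1,             # num_epochs: 1
)
```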
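The removed card's usage text mentions generating through a pipeline with a fixed seed, although its snippet loads the tokenizer and model directly. A minimal sketch of the pipeline route, reusing the sampling settings from that snippet, might look like:

```python
from transformers import pipeline, set_seed

# Generation samples randomly, so fix a seed for reproducibility.
set_seed(42)

generator = pipeline("text-generation", model="lucasbiagettia/gpt2-base-borges")
result = generator(
    "La arena me recuerda el profundo dolor de la nostalgia",
    max_new_tokens=100,
    do_sample=True,
    top_k=35,
    top_p=0.95,
    temperature=0.8,
    no_repeat_ngram_size=6,
)
print(result[0]["generated_text"])
```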
config.json CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "PlanTL-GOB-ES/gpt2-base-bne",
+  "_name_or_path": "lucasbiagettia/gpt2-base-borges",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"
@@ -35,7 +35,7 @@
     }
   },
   "torch_dtype": "float32",
-  "transformers_version": "4.35.2",
+  "transformers_version": "4.38.2",
   "use_cache": true,
   "vocab_size": 50261
 }
generation_config.json CHANGED
@@ -3,5 +3,5 @@
   "bos_token_id": 2,
   "eos_token_id": 2,
   "pad_token_id": 1,
-  "transformers_version": "4.35.2"
+  "transformers_version": "4.38.2"
 }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4f244f5a8ffc6e11a5ebe0d680c59ca8d435d58f104dcc4b3670a29dfd0f3538
+oid sha256:b4d99981215b97db1a60d20005d9a12065789bb1975ba13974181455d9755583
 size 496213632
runs/Mar22_23-58-05_9aca86da8fe2/events.out.tfevents.1711151885.9aca86da8fe2.5226.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7220a95965af40eba7a0dc2ddf8cd18b209d05bc7bbb3c08f988df10a024bb30
+size 5231
training_args.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c53e1863a1a7f3da730f829d2615da843002d73c0cf1cc85d9cef2cadddaba10
+size 4856