AlexWortega committed on
Commit 214d407
1 Parent(s): 9d0c524

Update README.md

Files changed (1)
  1. README.md +67 -0
README.md CHANGED
@@ -2,6 +2,18 @@
  license: apache-2.0
  datasets:
  - IlyaGusev/rulm
+ inference:
+   parameters:
+     min_length: 20
+     max_new_tokens: 250
+     top_k: 50
+     top_p: 0.9
+     early_stopping: true
+     no_repeat_ngram_size: 2
+     use_cache: true
+     repetition_penalty: 1.5
+     length_penalty: 0.8
+     num_beams: 2
  language:
  - ru
  library_name: transformers
@@ -10,3 +22,58 @@ tags:
  - finance
  - code
  ---
+
+ <h1 style="font-size: 42px">WortegaLM 109m</h1>
+
+ # Model Summary
+
+ > This is a GPT-Neo-like model trained from scratch on a 95 GB dataset of code, Habr, Pikabu, and news. It can handle primitive tasks, is not suitable for zero-shot or few-shot prompting, but is ideal as a model for student projects.
+
+ # Quick Start
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Load the tokenizer and model, then move the model to the GPU
+ tokenizer = AutoTokenizer.from_pretrained('AlexWortega/wortegaLM', padding_side='left')
+ device = 'cuda'
+ model = AutoModelForCausalLM.from_pretrained('AlexWortega/wortegaLM')
+ model.resize_token_embeddings(len(tokenizer))
+ model.to(device)
+
+
+ def generate_seqs(q, model, k=2):
+     # Beam-search sampling settings; returns k candidate continuations
+     gen_kwargs = {
+         "min_length": 20,
+         "max_new_tokens": 100,
+         "top_k": 50,
+         "top_p": 0.7,
+         "do_sample": True,
+         "early_stopping": True,
+         "no_repeat_ngram_size": 2,
+         "eos_token_id": tokenizer.eos_token_id,
+         "pad_token_id": tokenizer.eos_token_id,
+         "use_cache": True,
+         "repetition_penalty": 1.5,
+         "length_penalty": 1.2,
+         "num_beams": 4,
+         "num_return_sequences": k
+     }
+     # Encode the prompt, generate, and decode the candidate sequences
+     t = tokenizer.encode(q, add_special_tokens=False, return_tensors='pt').to(device)
+     g = model.generate(t, **gen_kwargs)
+     generated_sequences = tokenizer.batch_decode(g, skip_special_tokens=False)
+
+     return generated_sequences
+ ```
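
A minimal usage sketch for the snippet above, assuming it has already been run; the prompt string is only a hypothetical example and is not taken from the original card:

```python
# Hypothetical prompt; any short Russian text can be used as input
question = "Напиши пример функции на Python"
for seq in generate_seqs(question, model, k=2):
    print(seq)
```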
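For reference, the `inference.parameters` block added to the front matter supplies default generation settings for the Hub's hosted inference widget. A rough local equivalent using the same values is sketched below; it assumes the model and tokenizer from Quick Start are already loaded, and the prompt is again a hypothetical example:

```python
# Generation defaults copied from the YAML front matter above
widget_kwargs = {
    "min_length": 20,
    "max_new_tokens": 250,
    "top_k": 50,
    "top_p": 0.9,
    "early_stopping": True,
    "no_repeat_ngram_size": 2,
    "use_cache": True,
    "repetition_penalty": 1.5,
    "length_penalty": 0.8,
    "num_beams": 2,
}
inputs = tokenizer("Пример запроса", return_tensors="pt").to(device)  # hypothetical prompt
print(tokenizer.decode(model.generate(**inputs, **widget_kwargs)[0], skip_special_tokens=True))
```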