flexudy committed on
Commit
8b97a48
•
1 Parent(s): 23f0026

Update README.md

Files changed (1)
  1. README.md +27 -21
README.md CHANGED
@@ -1,18 +1,22 @@
1
  # Cheapity3 🐷
2
- "GPT-like" T5 model trained to generate text in multiple languages.
 
3
 
4
  ## Motivation
 
5
  - GPT models are expensive to run.
6
  - GPT models are monolingual.
7
 
8
  ## Solution
9
- - Maybe, Small Models aren't Terrible (*SMarT*)
 
10
  - Plus, they are cheaper to run.
11
 
12
- I fine-tuned T5 on multiple languages (🇬🇧 English, 🇩🇪 German, 🇫🇷 French) and multiple academic text snippets from various
13
- domains like tech, law, finance and science etc. to generate text, just like GPT models do.
14
 
15
  ## Usage
 
16
  - Provide some text e.g `"Italy, officially the Italian Republic is a country consisting of"`
17
  - Tell Cheapity3 how many words you want to generate e.g `15` -- 😃 Yes, you can control the length.
18
  - Cheapity3 reads your text and generates a continuation containing approximately 15 words.
@@ -24,31 +28,33 @@ tokenizer = AutoTokenizer.from_pretrained("flexudy/cheapity3")
24
 
25
  model = AutoModelWithLMHead.from_pretrained("flexudy/cheapity3")
26
 
27
- input_text = "guess: Italy, officially the Italian Republic is a country consisting of { _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ }" # 15 words
 
 
28
 
29
- inputs = tokenizer.encode(input_text, return_tensors="pt", truncation=True, max_length=512)
30
 
31
  input_ids = inputs["input_ids"]
32
 
33
  attention_mask = inputs["attention_mask"]
34
 
35
  outputs = model.generate(
36
- input_ids=input_ids,
37
- attention_mask=attention_mask,
38
- max_length=128,
39
- do_sample=True,
40
- early_stopping=True,
41
- num_return_sequences=4,
42
- repetition_penalty=2.5
43
- )
44
 
45
  for i in range(4):
46
  print(tokenizer.decode(outputs[i], skip_special_tokens=True, clean_up_tokenization_spaces=True))
47
 
48
- # >
49
- # >
50
- # >
51
- # >
52
- ```
53
-
54
- #
 
1
  # Cheapity3 🐷
2
+
3
+ GPT3-like T5 model trained to generate text in multiple languages.
4
 
5
  ## Motivation
6
+
7
  - GPT models are expensive to run.
8
  - GPT models are monolingual.
9
 
10
  ## Solution
11
+
12
+ - Maybe, Small Models aren't Terrible (*SMarT*)
13
  - Plus, they are cheaper to run.
14
 
15
+ I fine-tuned T5 on multiple languages (🇬🇧 English, 🇩🇪 German, 🇫🇷 French) and multiple academic text snippets from
16
+ various domains like tech, law, finance and science etc. to generate text, just like GPT models do.
17
 
18
  ## Usage
19
+
20
  - Provide some text e.g `"Italy, officially the Italian Republic is a country consisting of"`
21
  - Tell Cheapity3 how many words you want to generate e.g `15` -- 😃 Yes, you can control the length.
22
  - Cheapity3 reads your text and generates a continuation containing approximately 15 words.
 
28
 
29
  model = AutoModelWithLMHead.from_pretrained("flexudy/cheapity3")
30
 
31
+ input_text = """The mechanical engineering field requires an understanding of core areas including mechanics, dynamics,
32
+ thermodynamics, materials science, structural analysis, and
33
+ electricity. { _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ }""" # 15 words
34
 
35
+ inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
36
 
37
  input_ids = inputs["input_ids"]
38
 
39
  attention_mask = inputs["attention_mask"]
40
 
41
  outputs = model.generate(
42
+ input_ids=input_ids,
43
+ attention_mask=attention_mask,
44
+ max_length=128,
45
+ do_sample=True,
46
+ early_stopping=True,
47
+ num_return_sequences=4,
48
+ repetition_penalty=2.5
49
+ )
50
 
51
  for i in range(4):
52
  print(tokenizer.decode(outputs[i], skip_special_tokens=True, clean_up_tokenization_spaces=True))
53
 
54
+ # Italy, officially the Italian Republic is a country consisting of
55
+ # > Cheapity3 continuing:
56
+ # ... Developing the knowledge base for these core areas will enable engineers to build their capabilities rapidly and efficiently. ...
57
+ # ... The field of mechanics offers a variety and broad range for applications throughout the engineering/technological fields. ...
58
+ # ... Mechanics generally is not understood by students. While they can be employed in the field, mechanical engineering ...
59
+ # ... Introduction to mechanical engineering and core fields including chemical products, materials science, structural analysis, and geomatics ...
60
+ ```
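
The Usage steps in this README (provide some text, then say how many words you want) appear to map onto a placeholder of the form `{ _ _ _ }` appended to the input, with one underscore per requested word. The sketch below builds such a prompt. `build_prompt` is a hypothetical helper, not part of the README or of any library, and the placeholder format is inferred only from the `input_text` examples in this diff. Note that the removed version prefixed the text with `guess: ` while the added version does not; the README does not say whether the prefix is required, so it is left out here.

```python
def build_prompt(text: str, n_words: int) -> str:
    """Append a placeholder holding one underscore per requested word.

    Assumption: the model reads the number of underscores as the
    approximate number of words to generate, as the README examples suggest.
    """
    if n_words < 1:
        raise ValueError("n_words must be at least 1")
    blanks = " ".join("_" for _ in range(n_words))
    # Doubled braces produce literal "{" and "}" inside an f-string.
    return f"{text} {{ {blanks} }}"


prompt = build_prompt(
    "Italy, officially the Italian Republic is a country consisting of", 15
)
print(prompt)
```

The resulting string could then be passed to the tokenizer in place of the hard-coded `input_text`, which keeps the requested length in one argument instead of a hand-counted run of underscores.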