---
license: apache-2.0
datasets:
- OpenAssistant/oasst1
- erfanzar/CC-H2OAI-OASST-1-TRAIN
- erfanzar/CC-OASST-1-TRAIN
language:
- en
- fr
- fa
- nl
metrics:
- bertscore
pipeline_tag: text-generation
---


# OpenSourceTransformers-OST Project

[OST-OpenSourceTransformers Github](https://github.com/erfanzar/OST-OpenSourceTransformers)

## Hello community

This model is only 1B parameters, but for its size you could fairly call it close to state of the art.

It can also run in as little as 4 GB of GPU RAM (loaded in 8-bit, as in the usage code below) and handles dialog well.
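As a rough sanity check of the 4 GB figure, you can query the loaded model's weight footprint; `get_memory_footprint()` is a standard `transformers` helper, and this is a sketch rather than a measured result:

```python
from transformers import AutoModelForCausalLM

# 8-bit loading requires the bitsandbytes package; a 1B-parameter model
# in int8 needs roughly 1 GiB for the weights alone, leaving headroom
# for activations and the KV cache on a 4 GB card.
model = AutoModelForCausalLM.from_pretrained(
    "erfanzar/PGT-1B-2EP", device_map="auto", load_in_8bit=True
)

# Size of the loaded parameters and buffers, in GiB.
print(f"{model.get_memory_footprint() / 1024**3:.2f} GiB")
```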


### Training Parameters

- learning rate: 2e-4
- scheduler: cosine learning rate
- hardware: 4 × T4 GPUs
- batch size: auto-found
- training time: 12 hours
- max sequence length: 1024
- epochs: 2
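For reference, here is a minimal sketch of how these hyperparameters map onto Hugging Face `TrainingArguments`; the dataset and `Trainer` wiring are omitted, and the per-device batch size is a placeholder since the original run auto-tuned it:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="pgt-1b-2ep",        # placeholder output path
    learning_rate=2e-4,             # learning rate: 2e-4
    lr_scheduler_type="cosine",     # scheduler: cosine learning rate
    num_train_epochs=2,             # epochs: 2
    per_device_train_batch_size=8,  # placeholder; the original run auto-tuned this
)

# The 1024-token limit is applied at tokenization time, e.g.:
# tokenizer(batch["text"], truncation=True, max_length=1024)
```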
## Usage Code

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from IPython.display import clear_output
import textwrap


tokenizer = AutoTokenizer.from_pretrained("erfanzar/PGT-1B-2EP")

# 8-bit loading (requires bitsandbytes) keeps the model within ~4 GB of GPU RAM.
model = AutoModelForCausalLM.from_pretrained(
  "erfanzar/PGT-1B-2EP", device_map='auto', load_in_8bit=True
)


def verify_text(txt: str) -> str:
  """Wrap each line at 110 characters for readable notebook output."""
  return '\n'.join(textwrap.fill(line, width=110) for line in txt.split('\n'))


def ppp(text: str) -> str:
  """Wrap a user message in the OASST-style prompt format used during training."""
  return f"<|prompter|> {text} <|endoftext|><|assistant|>"


def generate(text, max_new_tokens: int = 1024, use_ppp: bool = False, b_pair=False):
  """Generate one token at a time, yielding partial output for streaming.

  With b_pair=True only the newly generated text is yielded each step;
  otherwise the full prompt-plus-completion is yielded.
  """
  text = ppp(text) if use_ppp else text

  for _ in range(max_new_tokens):
    enc = tokenizer(text, return_tensors='pt', add_special_tokens=False)
    text_r = text
    # Move inputs to the model's device before extending by one token.
    out = model.generate(enc.input_ids.to(model.device), max_new_tokens=1, pad_token_id=0)
    text = tokenizer.decode(out[0], skip_special_tokens=False)
    # The model sometimes emits a run of blank lines instead of <|endoftext|>;
    # treat four trailing newlines as end-of-sequence.
    if text.endswith('\n\n\n\n'):
      text = text[:-4] + tokenizer.eos_token

    yield text[len(text_r):] if b_pair else text
    if text.endswith(tokenizer.eos_token):
      break


for v in generate('what is a gpu', 512, True):
  clear_output(wait=True)  # redraw the growing answer in place
  print(verify_text(v), end='')
```
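The token-by-token loop above exists only to stream partial output. If you just need the final answer, a single `generate` call is simpler and faster; the sketch below reuses the `tokenizer`, `model`, and `ppp` defined above and streams with the stock `transformers` `TextStreamer`:

```python
from transformers import TextStreamer

# Prints decoded tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_special_tokens=True)

inputs = tokenizer(
    ppp("what is a gpu"), return_tensors="pt", add_special_tokens=False
).to(model.device)

model.generate(
    **inputs,
    max_new_tokens=512,
    pad_token_id=0,
    eos_token_id=tokenizer.eos_token_id,  # stop at <|endoftext|>
    streamer=streamer,
)
```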

# Pythia-1B

## Model Details

### Pretrained Model

  - Developed by: [EleutherAI](http://eleuther.ai)
  - Model type: Transformer-based Language Model
  - Fine-tuned languages: English, Persian, French, and Dutch
  - Learn more: [Pythia's GitHub repository](https://github.com/EleutherAI/pythia) for training procedures, config files, and details on how to use.
  - Library: [GPT-NeoX](https://github.com/EleutherAI/gpt-neox)
  - License: [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)



## NOTE

The Pythia Suite is **NOT** intended for deployment. It is not in itself 
a product and cannot be used for human-facing interactions. For example, 
the model may generate harmful or offensive text...


Also keep in mind that, at least in this version, the model's output in Persian, French, and Dutch is noticeably weaker than in English.