erfanzar/PGT-1B-2EP

erfanzar committed on Apr 26, 2023

Commit b3f4be0 • 1 Parent(s): e7c8cf4

Create README.md

Files changed (1)

README.md +89 -0

README.md ADDED

	@@ -0,0 +1,89 @@

+---
+license: apache-2.0
+datasets:
+- OpenAssistant/oasst1
+- erfanzar/CC-H2OAI-OASST-1-TRAIN
+- erfanzar/CC-OASST-1-TRAIN
+language:
+- en
+- fr
+- fa
+- nl
+metrics:
+- bertscore
+pipeline_tag: text-generation
+---
+## Hello community
+This model has only 1B parameters, but you could still call it something of a SOTA model.
+It also runs on a GPU with 4 GB of memory and understands dialog as well.
+## Usage Code
+```python
+import textwrap
+
+from IPython.display import clear_output  # notebook-only: redraws the cell output
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+tokenizer = AutoTokenizer.from_pretrained("erfanzar/PGT-1B-2EP")
+# load_in_8bit needs the bitsandbytes and accelerate packages installed
+model = AutoModelForCausalLM.from_pretrained("erfanzar/PGT-1B-2EP", device_map='auto', load_in_8bit=True)
+
+def verify_text(txt: str) -> str:
+  """Wrap every line of txt to at most 140 characters."""
+  return '\n'.join(textwrap.fill(line, width=140) for line in txt.split('\n'))
+
+def ppp(text: str) -> str:
+  """Pre-process a prompt: wrap it in the prompter/assistant tokens."""
+  return f"<|prompter|>{text}<|endoftext|><|assistant|>"
+
+def generate(text, max_new_tokens: int = 512, use_ppp: bool = False, b_pair=False):
+  """Generate one token per step; yield the full text so far, or only the new piece if b_pair."""
+  text = ppp(text) if use_ppp else text
+  for _ in range(max_new_tokens):
+    # Tokenize the text so far and move the tensors to the model's device
+    enc = tokenizer(text, return_tensors='pt').to(model.device)
+    text_r = text
+    out = model.generate(**enc, max_new_tokens=1, pad_token_id=0)
+    text = tokenizer.decode(out[0])
+    # Stop as soon as the model emits the end-of-text token
+    if text.endswith(tokenizer.eos_token):
+      break
+    yield text[len(text_r):] if b_pair else text
+
+for v in generate('where is empire building ?', 512, True):
+  clear_output(wait=True)
+  print(verify_text(v), end='')
+```
+# Pythia-1B
+## Model Details
+### Pretrained Model
+  - Developed by: [EleutherAI](http://eleuther.ai)
+  - Model type: Transformer-based Language Model
+  - Fine-tuned languages: English, Persian, French, and Dutch
+  - Learn more: see [Pythia's GitHub repository](https://github.com/EleutherAI/pythia) for training procedures, config files, and usage details.
+  - Library: [GPT-NeoX](https://github.com/EleutherAI/gpt-neox)
+  - License: [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+## NOTE
+The Pythia Suite is **NOT** intended for deployment. It is not in itself
+a product and cannot be used for human-facing interactions. For example,
+the model may generate harmful or offensive text...
+
+Also remember that this model is not yet good enough at Persian, French, and Dutch, at least in this version.
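
A side note on the usage code in this README: with `b_pair=False` the generator yields the whole decoded text, prompt tokens included. The sketch below shows one way the assistant's reply could be separated from that text. It needs no model; `build_prompt` mirrors `ppp()` from the README, while `extract_reply` and the mock `decoded` string are hypothetical and only illustrate the string handling. It also assumes `<|endoftext|>` is the tokenizer's EOS token.

```python
PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"
EOS = "<|endoftext|>"  # assumption: equals tokenizer.eos_token for this model


def build_prompt(user_text: str) -> str:
    """Same prompt layout as ppp() in the README."""
    return f"{PROMPTER}{user_text}{EOS}{ASSISTANT}"


def extract_reply(decoded: str) -> str:
    """Hypothetical helper: keep only the text after the last assistant marker."""
    # rpartition returns the whole string as the last element if the marker is absent.
    _, _, reply = decoded.rpartition(ASSISTANT)
    # Drop a trailing end-of-text marker if generation ran to completion.
    if reply.endswith(EOS):
        reply = reply[: -len(EOS)]
    return reply.strip()


prompt = build_prompt("where is empire building ?")
# Mock of a decoded generation; a real one would come from tokenizer.decode(...).
decoded = prompt + " It is in New York City." + EOS
print(extract_reply(decoded))
```

Because the helper works on plain strings, it could be applied to each value yielded by `generate(...)` before wrapping it with `verify_text`.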