
This model uses task classification, and the conversation is between USER and Answer (or AI).


A JAX/Flax version of the model is available for both training and inference. The model supports a context length of 3300 tokens.
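
To make the 3300-token limit concrete, here is a minimal sketch of budgeting generation length against it. It assumes the limit covers the prompt and the generated tokens together, which the card does not state explicitly; `CONTEXT_LENGTH` and `remaining_tokens` are hypothetical names used only for illustration and are not part of OST.

```python
CONTEXT_LENGTH = 3300  # context length stated in this model card


def remaining_tokens(prompt_length: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many new tokens still fit after a prompt of `prompt_length` tokens.

    Assumption: prompt and generated tokens share one context window.
    """
    if prompt_length < 0:
        raise ValueError("prompt_length must be non-negative")
    # Never return a negative budget when the prompt already exceeds the window.
    return max(context_length - prompt_length, 0)


print(remaining_tokens(300))
```

A value like this could be used to cap the number of sampling steps so that prompt and output together stay inside the context window.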

This model can be run with OST_UI. After cloning the repository, a single command launches it:

git clone https://github.com/erfanzar/OST-OpenSourceTransformers
cd OST-OpenSourceTransformers/
python3 OST_UI/app.py --model_id='erfanzar/chatLGeM' --num_gpus <NUMBER OF GPUS TO USE>

Examples 🚀

</s><|prompter|> TEXT </s><|assistant|>
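
The template above can be filled in programmatically. The helper below is a minimal sketch: `build_prompt` is a hypothetical name, and the spacing around the text simply mirrors the example line rather than any documented requirement.

```python
def build_prompt(text: str) -> str:
    """Wrap a single user message in the prompter/assistant template."""
    # Strip surrounding whitespace so the template controls the spacing.
    return f"</s><|prompter|> {text.strip()} </s><|assistant|>"


print(build_prompt("What is JAX?"))
```

The returned string would then be tokenized and passed to the model as the prompt.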

Or simply open Google Colab 🚀🚀

A generate method that yields the response piece by piece as it is sampled:

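import torch

def generate(model_, input_ids_, tokeinzer_, max_length: int = 3300, temperature: float = 0.2, eos_token_id: int = 2):
  with torch.no_grad():
    # Prompt token count plus one; used below as an offset into the decoded text.
    before_start = len(input_ids_[0]) + 1
    for _ in range(max_length):
      # Forward pass over the whole sequence generated so far.
      out = model_(input_ids=input_ids_, return_dict=True)
      # Temperature-scaled distribution over the next token.
      opa = torch.nn.functional.softmax(out.logits[:, -1, :] / temperature, dim=-1)
      Camila = torch.multinomial(opa, 1)
      input_ids_ = torch.cat([input_ids_, Camila], -1)
      # Stop once the end-of-sequence token is sampled.
      if Camila[0].item() == eos_token_id:
        break
      yield tokeinzer_.decode(Camila[0], skip_special_tokens=True)
  return f"{tokeinzer_.decode(input_ids_[0], skip_special_tokens=True)[before_start:]}"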
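
The sampling step in `generate` divides the logits by `temperature` before applying the softmax. As an illustration of why a low value such as the default 0.2 makes sampling close to greedy, here is a small pure-Python sketch; `softmax_with_temperature` and the example logits are made up for illustration and are not part of the model code.

```python
import math


def softmax_with_temperature(logits, temperature=0.2):
    """Softmax over `logits` after dividing each one by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
sharp = softmax_with_temperature(logits, temperature=0.2)
flat = softmax_with_temperature(logits, temperature=1.0)
print(sharp)
print(flat)
```

With these example logits, the top token receives more than 99% of the probability mass at temperature 0.2, compared with roughly two thirds at temperature 1.0, so lower temperatures give more deterministic output.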


import socket
import time

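def check_internet_connection():
    try:
        # Open a TCP connection to a well-known host; any failure raises OSError.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(5)
            s.connect(("www.google.com", 80))
        print("Internet connection is active.")
    except OSError:
        print("Internet connection is not active.")

if __name__ == "__main__":
    check_internet_connection()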


Using Model in OST

LGeM 🚀

  • What is LGeM? LGeM is a causal language model (CausalLM) trained on self-instruct data (Alpaca data). To initialize the first training run of the main model (weights are available), I used pretrained weights from Alpaca LoRA (open source).

  • It is decoder-only.

  • It is built in PyTorch and JAX.

  • You can import the models as follows (in the EasyDeL or OST library):

# PyTorch
from modules import LGeMForCausalLM
# JAX
from modules import FlaxLGeMForCausalLM
  • Training code is available in jax_train.py (check the source).
  • Training parameters:
    • Learning rate: 2e-5
    • Optimizer: AdamW
    • Batch size: 32
    • Hardware: TPU Pod
    • Training time: 50 hours
    • Budget: $500
python3 LGeM-train.py
