
This model uses task classification, and conversations are formatted as turns between a USER and an Answer/AI role.

NOTE ⚠️

The JAX/Flax version of the model is available for both training and inference, and the model supports a context length of 3300 tokens.

This model can be run with OST_UI, so here's how to launch it with just a few commands:

git clone https://github.com/erfanzar/OST-OpenSourceTransformers
cd OST-OpenSourceTransformers/
python3 OST_UI/app.py --model_id='erfanzar/chatLGeM' --num_gpus <NUMBER OF GPUS TO USE>

Examples 🚀

Prompts follow this template:

</s><|prompter|> TEXT </s><|assistant|>
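
A minimal sketch of building and tokenizing a prompt in this format, assuming the tokenizer is loadable with Hugging Face transformers (the OST library may expose its own loader):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('erfanzar/chatLGeM')

def build_prompt(user_text: str) -> str:
    # Wrap the user text in the prompter/assistant template shown above.
    return f"</s><|prompter|> {user_text} </s><|assistant|>"

input_ids = tokenizer(build_prompt('Write a function that checks the internet connection.'),
                      return_tensors='pt').input_ids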

Or simply open it in Google Colab 🚀🚀

Generate method to stream the response text token by token


import torch
from IPython.display import clear_output

def generate(model, input_ids, tokenizer, max_new_tokens: int = 3300,
             temperature: float = 0.2, eos_token_id: int = 2):
    # Sampling loop that streams the response token by token.
    with torch.no_grad():
        prompt_length = len(input_ids[0])
        for _ in range(max_new_tokens):
            out = model(
                input_ids=input_ids,
                return_dict=True,
            )
            # Temperature-scaled distribution over the next token.
            probs = torch.nn.functional.softmax(out.logits[:, -1, :] / temperature, dim=-1)
            next_token = torch.multinomial(probs, 1)
            input_ids = torch.cat([input_ids, next_token], dim=-1)
            clear_output(wait=True)
            # Print only the newly generated text, not the prompt.
            print(f"\r{tokenizer.decode(input_ids[0][prompt_length:], skip_special_tokens=True)}", end='')
            if next_token[0].item() == eos_token_id:
                break
            yield tokenizer.decode(next_token[0], skip_special_tokens=True)
    return tokenizer.decode(input_ids[0][prompt_length:], skip_special_tokens=True)

Result

import socket
import time

def check_internet_connection():
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect(("www.google.com", 80))
        print("Internet connection is active.")
    except:
        print("Internet connection is not active.")

if __name__ == "__main__":
    check_internet_connection()

Using the model in OST

LGeM 🚀

  • What is LGeM? LGeM is a causal language model trained on self-instruct data (the Alpaca dataset); the first training run of the main model was initialized from the open-source Alpaca-LoRA pre-trained weights (weights are available)

  • it's a decoder-only architecture

  • built in both PyTorch and JAX

  • you can simply import the models (in the EasyDeL or OST library):

# PyTorch
from modules import LGeMForCausalLM
# JAX
from modules import FlaxLGeMForCausalLM
  • the training code is available in jax_train.py (see the source repository)
  • training parameters (see the optimizer sketch below):
    • learning rate: 2e-5
    • optimizer: AdamW
    • batch size: 32
    • hardware: TPU Pod
    • train time: 50 hours
    • budget: $500

Training is launched with:

python3 LGeM-train.py
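
A minimal sketch of the optimizer these hyperparameters describe, using optax; this is an illustrative assumption, not the verbatim configuration from jax_train.py:

import jax.numpy as jnp
import optax

optimizer = optax.adamw(learning_rate=2e-5)  # AdamW at the learning rate listed above

# A toy parameter pytree stands in for FlaxLGeMForCausalLM's parameters.
params = {'w': jnp.zeros((4, 4))}
opt_state = optimizer.init(params)

# One update step, given gradients of the loss with respect to params:
grads = {'w': jnp.ones((4, 4))}
updates, opt_state = optimizer.update(grads, opt_state, params)
params = optax.apply_updates(params, updates)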
