GenZ 13B v2 4bit

GenZ 13B v2 4bit is an instruction-finetuned model with a 4K-token input length. It is finetuned on top of the pretrained Llama 2 model.

Inference

from transformers import LlamaTokenizer
from auto_gptq import AutoGPTQForCausalLM

base_model = 'budecosystem/genz-13b-v2-4bit'
device = "cuda:0"  # assumes a CUDA GPU is available

tokenizer = LlamaTokenizer.from_pretrained(base_model)

# Load the 4-bit GPTQ checkpoint directly onto the GPU
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path=base_model,
        model_basename="gptq_model-4bit-128g",
        use_safetensors=True,
        trust_remote_code=True,
        device=device)

prompt = """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
USER: who are you? ASSISTANT: """
# Move the input tensors to the same device as the model
inputs = tokenizer(prompt, return_tensors="pt").to(device)
# max_length counts the prompt tokens as well as the generated ones
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))

Use the following prompt template:

A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi, how are you? ASSISTANT: 
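
The template above can be filled in programmatically. The following is a minimal sketch, not part of the GenZ code: `build_prompt` and `extract_answer` are hypothetical helper names, and stripping `</s>` assumes the decoded output ends with the Llama end-of-sequence token.

```python
# System preamble copied from the prompt template above.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(user_message: str) -> str:
    """Wrap a single user message in the template (hypothetical helper)."""
    return f"{SYSTEM_PROMPT} USER: {user_message.strip()} ASSISTANT: "


def extract_answer(decoded: str) -> str:
    """Return the text after the last 'ASSISTANT:' marker (hypothetical helper).

    Assumes the end-of-sequence token, if present, is rendered as '</s>'.
    """
    answer = decoded.rsplit("ASSISTANT:", 1)[-1]
    return answer.replace("</s>", "").strip()


if __name__ == "__main__":
    print(build_prompt("Hi, how are you?"))
```

The string returned by `build_prompt` can be passed to the tokenizer in place of the hard-coded `prompt`, and `extract_answer` can be applied to the decoded output to drop the echoed prompt.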

Finetuning

python finetune.py \
   --model_name meta-llama/Llama-2-13b \
   --data_path dataset.json \
   --output_dir output \
   --trust_remote_code \
   --prompt_column instruction \
   --response_column output
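
The command above reads `dataset.json` and takes the prompt from the `instruction` column and the response from the `output` column. The sketch below writes a tiny file with those two fields. The records are made up for illustration, and the layout (a single JSON array of objects) is an assumption; check what `finetune.py` actually expects before relying on it.

```python
import json

# Hypothetical example records. The field names match the
# --prompt_column and --response_column values in the command above.
records = [
    {"instruction": "Who are you?", "output": "I am an AI assistant."},
    {"instruction": "Name a primary color.", "output": "Red."},
]

# Assumed layout: one JSON array of objects in a single file.
with open("dataset.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)
```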

The code is available in the GenZ GitHub repository.
