This is my attempt at instruction tuning for Hebrew-Mistral-7B.

The model hallucinates a lot. Generate 4-5 completions for each prompt and keep the best one.

Also, note that this model does not have any moderation mechanisms.
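
The advice above to generate several times per prompt can be sketched as a small helper. This is only an illustration: `sample_candidates` and `generate_fn` are hypothetical names that are not part of this model card or of transformers, and `generate_fn` is assumed to be any callable that takes a prompt and returns the generated text as a string.

```python
from typing import Callable, List


def sample_candidates(generate_fn: Callable[[str], str], prompt: str, n: int = 5) -> List[str]:
    """Call generate_fn(prompt) n times and return the distinct, non-empty outputs.

    Outputs are stripped of surrounding whitespace and kept in first-seen order,
    so duplicates produced by repeated sampling are shown only once.
    """
    candidates: List[str] = []
    for _ in range(n):
        text = generate_fn(prompt).strip()
        if text and text not in candidates:
            candidates.append(text)
    return candidates


if __name__ == "__main__":
    import itertools

    # Stand-in for a real model call (assumption): cycles through canned answers.
    canned = itertools.cycle(["answer A", "answer B", "answer A", ""])
    for i, text in enumerate(sample_candidates(lambda p: next(canned), "prompt", n=5), start=1):
        print(f"[{i}] {text}")
```

With the real model, `generate_fn` would need to wrap a sampling call (`do_sample=True`) and return the decoded text. Picking the best candidate is left to the reader, since the card does not define a scoring method.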

Usage

Install dependencies

pip install -q -U transformers
pip install accelerate
pip install -q sentencepiece
pip install protobuf

Run inference

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, StoppingCriteria, TextStreamer

model_id = "ronmasas/Hebrew-Mistral-7B-Instruct-v0.1"

# Load the model directly in half precision (FP16) on the first GPU, then the tokenizer
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="cuda:0")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Define special tokens
END_TOKEN = 64003 # -> [END]
B_INST, E_INST = "[INST]", "[/INST]"
SYSTEM_PROMPT = "A conversation between a human and an AI assistant."

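class EosListStoppingCriteria(StoppingCriteria):
    """Stop generation once the most recent tokens match eos_sequence."""
    def __init__(self, eos_sequence=None):
        # Avoid a mutable default argument; fall back to the single [END] token
        self.eos_sequence = [END_TOKEN] if eos_sequence is None else list(eos_sequence)

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Compare the tail of every sequence in the batch against eos_sequence
        last_ids = input_ids[:, -len(self.eos_sequence):].tolist()
        return self.eos_sequence in last_ids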

def stream(user_prompt):
    """Generate a reply to user_prompt and print it to stdout as it is produced."""
    prompt = f"{SYSTEM_PROMPT} {B_INST} {user_prompt.strip()} {E_INST}\n"
    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
    streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    model.generate(
        **inputs, streamer=streamer, max_new_tokens=256,
        do_sample=True, temperature=0.7,
        stopping_criteria=[EosListStoppingCriteria()],
    )

# Prompt (English): "Write a thank-you letter to Yam Peleg for training and releasing the Hebrew Mistral 7B model."
stream("כתוב מכתב תודה ל-ים פלג על כך שאימן ושיחרר את המודל Hebrew Mistral 7B.")
"""
הי ים,

אני רוצה להודות לך מקרב לב על העבודה הקשה שהשקעת בדגם Hebrew Mistral 7B. המודל שלך הוא יצירת מופת של למידת מכונה ואני אסיר תודה על הזמן והכישרון שהושקעו בפיתוחו. היכולת להתאמן על שפה באמצעות פלטפורמה כה מתקדמת ומושכת הייתה משב רוח מרענן ומועיל ביותר עבור הקהילה כולה. תודה על כל העבודה הקשה והחזון שהושקעה בפרויקט זה, הוא באמת מאיר עיניים ומשפיע.

שוב תודה מקרב לב,
[שמך]
"""
# Prompt (English): "What is arnona?" (arnona is the Israeli municipal property tax)
stream("מה זה ארנונה?")
"""
ארנונה, הידועה גם בשם מס ארנונה, היא מס שנגבה על ידי רשות מקומית (עיר, מחוז או אזור) עבור שירותים מונציפליים שהיא מספקת, כגון חינוך, בריאות ושירותים. זה יכול לכלול מיסים על הכנסה, רכוש ושירותים (כגון מים).
"""

# Prompt (English): "Write three blog headlines about the importance of collecting animal feces."
stream("כתוב שלוש כותרות לבלוג על חשיבות איסוף צואה של בעלי חיים.")
"""
1. שמירת תזונה מאוזנת של חיות מחמד באמצעות איסוף צואה
2. השפעות בריאותיות ארוכות טווח של סירוב איסוף צואה של חיות מחמד
3. כיצד איסוף צואה של חיות מחמד עוזר לשמר היגיינה ובריאות בעלי חיים גלובלית.
"""

Model size: 7.5B params (Safetensors)
Tensor type: FP16
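
The two pieces of plain logic in the script above, the prompt template and the stop check, can be tried without loading the model. The sketch below is dependency-free and makes some assumptions: `build_prompt` and `should_stop` are hypothetical helper names, the constants are copied from the script, and plain Python lists stand in for the `input_ids` tensor.

```python
from typing import Sequence

# Constants copied from the usage script above
SYSTEM_PROMPT = "A conversation between a human and an AI assistant."
B_INST, E_INST = "[INST]", "[/INST]"
END_TOKEN = 64003  # id of [END], as stated in the script


def build_prompt(user_prompt: str) -> str:
    """Same template as stream(): system prompt, then the instruction inside [INST] ... [/INST]."""
    return f"{SYSTEM_PROMPT} {B_INST} {user_prompt.strip()} {E_INST}\n"


def should_stop(batch_ids: Sequence[Sequence[int]],
                eos_sequence: Sequence[int] = (END_TOKEN,)) -> bool:
    """Return True if any row of token ids ends with eos_sequence.

    Mirrors the tail comparison in EosListStoppingCriteria, using lists
    instead of a tensor (an assumption made to keep the sketch torch-free).
    """
    k = len(eos_sequence)
    if k == 0:
        return False  # an empty stop sequence never triggers
    return any(list(row[-k:]) == list(eos_sequence) for row in batch_ids)


if __name__ == "__main__":
    print(repr(build_prompt("  מה זה ארנונה?  ")))
    print(should_stop([[1, 2, END_TOKEN]]))   # tail matches [END]
    print(should_stop([[1, END_TOKEN, 2]]))   # [END] is not the last token
```

Note that the check fires as soon as any sequence in the batch ends with the stop sequence, which is fine for the single-prompt calls shown in this card.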