https://github.com/zejunwang1/bloom_tuning

The bloom-820m-chat model can be called to generate a conversation with the following code:

from transformers import BloomTokenizerFast, BloomForCausalLM

model_name_or_path = "WangZeJun/bloom-820m-chat"

tokenizer = BloomTokenizerFast.from_pretrained(model_name_or_path)
model = BloomForCausalLM.from_pretrained(model_name_or_path).cuda()
model = model.eval()

input_pattern = "{}</s>"
text = "你好"  # "Hello" in Chinese
input_ids = tokenizer(input_pattern.format(text), return_tensors="pt").input_ids
input_ids = input_ids.cuda()

outputs = model.generate(input_ids, do_sample=True, max_new_tokens=1024, top_p=0.85,
    temperature=0.3, repetition_penalty=1.2, eos_token_id=tokenizer.eos_token_id)

input_ids_len = input_ids.size(1)
response_ids = outputs[0][input_ids_len:]
response = tokenizer.decode(response_ids)
print(response)
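
The steps above (format the prompt, generate, drop the echoed prompt tokens, decode) can be wrapped in a small helper. This is a minimal sketch, not part of the original model card: `build_prompt`, `split_response`, and `chat` are hypothetical names, and it assumes `model` and `tokenizer` have already been loaded as shown above. Passing `skip_special_tokens=True` to `decode` is an added choice so that the trailing `</s>` does not appear in the returned text.

```python
# Sketch only: the helper names below are hypothetical, not from the model card.
INPUT_PATTERN = "{}</s>"  # same pattern as in the snippet above


def build_prompt(text, pattern=INPUT_PATTERN):
    """Append the end-of-sequence marker to the user's message."""
    return pattern.format(text)


def split_response(output_ids, prompt_len):
    """Drop the prompt tokens that generate() echoes at the start of its output."""
    return output_ids[prompt_len:]


def chat(model, tokenizer, text, **generate_kwargs):
    """Generate one reply; `model` and `tokenizer` are assumed loaded as above."""
    input_ids = tokenizer(build_prompt(text), return_tensors="pt").input_ids
    input_ids = input_ids.to(model.device)  # follow the model's device
    outputs = model.generate(
        input_ids, eos_token_id=tokenizer.eos_token_id, **generate_kwargs
    )
    response_ids = split_response(outputs[0], input_ids.size(1))
    # skip_special_tokens=True removes the trailing </s> from the decoded text
    return tokenizer.decode(response_ids, skip_special_tokens=True)
```

With the sampling settings from the original snippet, a call might look like `chat(model, tokenizer, "你好", do_sample=True, max_new_tokens=1024, top_p=0.85, temperature=0.3, repetition_penalty=1.2)`. Since `eos_token_id` is already set inside `chat`, it should not be passed again in the keyword arguments.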
Model size: 751M params (Safetensors, tensor type F32)