# ChatGLM2-6B int8 Quantized Model

่ฏฆๆƒ…ๅ‚่€ƒ K024/chatglm-qใ€‚

See K024/chatglm-q for more details.

```python
import torch
from chatglm_q.decoder import ChatGLMDecoder, chat_template

# load the int8-quantized checkpoint onto the GPU
device = torch.device("cuda")
decoder = ChatGLMDecoder.from_pretrained("K024/chatglm2-6b-int8", device=device)

# build a prompt with an empty chat history
prompt = chat_template([], "我是谁？")

# generate() yields the decoded text incrementally as tokens are produced
for text in decoder.generate(prompt):
    print(text)
```
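As background on what "int8" means here: quantization stores weights as 8-bit integers plus a floating-point scale, trading a small accuracy loss for roughly 4x less memory than fp32. The sketch below illustrates simple symmetric per-tensor int8 quantization; it is only a toy example and not chatglm-q's actual scheme, which may use per-channel scales and other refinements.

```python
def quantize_int8(values):
    # symmetric quantization: the largest magnitude maps to 127
    scale = max(abs(v) for v in values) / 127
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    # recover approximate float values from int8 codes
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
```

Here every code fits in [-127, 127], so each weight only needs one byte plus the shared scale.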

ๆจกๅž‹ๆƒ้‡ๆŒ‰ ChatGLM2-6b ่ฎธๅฏๅ‘ๅธƒ๏ผŒ่ง MODEL LICENSEใ€‚

Model weights are released under the same license as ChatGLM2-6b, see MODEL LICENSE.
