

Buy me a coffee if you like this project ;)

Description

GGML format model files for this project.

Inference


from ctransformers import AutoModelForCausalLM

# output_dir and ggml_file are placeholders: the directory containing the
# GGML file and its filename, respectively.
llm = AutoModelForCausalLM.from_pretrained(output_dir,
                                           model_file=ggml_file,
                                           gpu_layers=32,
                                           model_type="llama")

manual_input: str = "Tell me about your last dream, please."

llm(manual_input,
    max_new_tokens=256,
    temperature=0.9,
    top_p=0.7)
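The temperature and top_p arguments shape how tokens are sampled. As a rough illustration only (not ctransformers' internal code), nucleus (top-p) sampling keeps the smallest set of highest-probability tokens whose cumulative probability reaches top_p, then renormalizes before sampling:

```python
def top_p_filter(probs, top_p=0.7):
    # Keep the smallest set of most-likely tokens whose cumulative
    # probability reaches top_p, then renormalize the kept tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cum = [], 0.0
    for tok, p in ranked:
        kept.append(tok)
        cum += p
        if cum >= top_p:
            break
    total = sum(probs[t] for t in kept)
    return {t: probs[t] / total for t in kept}

# Toy next-token distribution: only "a" and "b" survive top_p=0.7.
probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(top_p_filter(probs, 0.7))
```

Lower top_p makes generations more focused; temperature similarly sharpens (below 1.0) or flattens (above 1.0) the distribution before this cutoff is applied.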

Original model card

A bilingual instruction-tuned LoRA model of https://huggingface.co/baichuan-inc/Baichuan-13B-Base
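As background (a toy sketch, not this model's actual weights or shapes): LoRA freezes the base weight matrix W and learns a low-rank update, so the effective weight is W + (alpha / r) * B @ A, where r is the LoRA rank (32 in the training command below):

```python
# Toy sketch of a LoRA-style low-rank update: W_eff = W + (alpha / r) * B @ A.
# Pure-Python matmul over nested lists; shapes are tiny for illustration only.
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

r, alpha = 1, 2                       # toy rank and scaling factor
W = [[1.0, 0.0], [0.0, 1.0]]          # frozen base weight (2x2)
B = [[0.5], [0.25]]                   # learned down-projection (2x1)
A = [[0.2, 0.4]]                      # learned up-projection (1x2)
delta = matmul(B, A)                  # rank-1 update (2x2)
W_eff = [[w + (alpha / r) * d for w, d in zip(wr, dr)]
         for wr, dr in zip(W, delta)]
print(W_eff)
```

Only B and A are trained, which is why the LoRA checkpoint is small compared with the 13B base model.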

Usage:

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("hiyouga/baichuan-13b-sft", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("hiyouga/baichuan-13b-sft", trust_remote_code=True).cuda()
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

query = "晚上睡不着怎么办"  # "What should I do if I can't sleep at night?"
template = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
    "Human: {}\nAssistant: "
)

inputs = tokenizer([template.format(query)], return_tensors="pt")
inputs = inputs.to("cuda")
generate_ids = model.generate(**inputs, max_new_tokens=256, streamer=streamer)
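For reference, the template above simply wraps the user query in a fixed system preamble; the final prompt string can be checked in plain Python without loading the model:

```python
template = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
    "Human: {}\nAssistant: "
)
# Same query as above: "What should I do if I can't sleep at night?"
prompt = template.format("晚上睡不着怎么办")
print(prompt)
```

The trailing "Assistant: " is what cues the model to begin its reply.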

Alternatively, you can launch a CLI demo using the script in https://github.com/hiyouga/LLaMA-Efficient-Tuning

python src/cli_demo.py --template default --model_name_or_path hiyouga/baichuan-13b-sft

You can reproduce our results by following this step-by-step guide (in Chinese):

https://zhuanlan.zhihu.com/p/645010851

or by running the following script in LLaMA-Efficient-Tuning:

CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --model_name_or_path baichuan-inc/Baichuan-13B-Base \
    --do_train \
    --dataset alpaca_gpt4_en,alpaca_gpt4_zh,self_cognition,oaast_sft,sharegpt_zh,refgpt_zh_p1,lima,self_cognition \
    --template default \
    --finetuning_type lora \
    --lora_rank 32 \
    --lora_target W_pack,o_proj,down_proj,up_proj,gate_proj \
    --output_dir baichuan_13b_lora \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --preprocessing_num_workers 16 \
    --max_source_length 1024 \
    --max_target_length 512 \
    --optim paged_adamw_32bit \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 100 \
    --eval_steps 100 \
    --warmup_steps 100 \
    --learning_rate 3e-5 \
    --max_grad_norm 0.5 \
    --num_train_epochs 2.0 \
    --quantization_bit 4 \
    --plot_loss \
    --fp16
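With these settings the effective batch size works out as follows (single GPU, per CUDA_VISIBLE_DEVICES=0):

```python
# Effective batch size = per-device batch size x gradient accumulation x GPU count.
per_device_train_batch_size = 4
gradient_accumulation_steps = 8
num_gpus = 1  # CUDA_VISIBLE_DEVICES=0
effective_batch_size = (per_device_train_batch_size
                        * gradient_accumulation_steps
                        * num_gpus)
print(effective_batch_size)  # → 32
```

Gradient accumulation trades wall-clock time for memory: each optimizer step sees 32 examples while only 4 reside on the GPU at once.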

Loss curve:

(figure: training loss curve)


Examples:

| Question | Baichuan-13b-sft | ChatGPT |
| --- | --- | --- |
| 你是谁 ("Who are you?") | 我是 <NAME>,一款由 <AUTHOR> 开发的人工智能助手。我可以回答各种问题,提供帮助和支持。 ("I am <NAME>, an AI assistant developed by <AUTHOR>. I can answer all kinds of questions and provide help and support.") | 我是ChatGPT,一个由OpenAI开发的人工智能语言模型。我基于GPT-3.5架构训练,旨在回答各种问题和提供帮助。有什么我可以帮助你的吗? ("I am ChatGPT, an AI language model developed by OpenAI. I was trained on the GPT-3.5 architecture and am designed to answer all kinds of questions and provide help. Is there anything I can help you with?") |
| 中国第三代战斗机的代表是什么? ("What is the representative of China's third-generation fighter jets?") | | |