Original model card

Buy me a coffee if you like this project ;)

Description

GGML-format model files for this project.

Inference


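from ctransformers import AutoModelForCausalLM

# output_dir and ggml_file are placeholders: define them before running.
# output_dir is the directory (or repo ID) that holds the GGML file,
# ggml_file is the name of the model file inside it.
llm = AutoModelForCausalLM.from_pretrained(
    output_dir,
    model_file=ggml_file,
    model_type="llama",
    gpu_layers=32,  # number of layers to offload to the GPU
)

manual_input: str = "Tell me about your last dream, please."

# Generate a completion and print it.
print(llm(manual_input, max_new_tokens=256, temperature=0.9, top_p=0.7))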
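
The call above samples with temperature=0.9 and top_p=0.7. As a rough illustration of what these two parameters do, here is a minimal pure-Python sketch on a toy set of logits. The function name `temperature_top_p` and the toy logits are hypothetical, and this is not how ctransformers implements sampling internally (its sampling runs in the native backend); it only shows the general idea of temperature scaling followed by nucleus (top-p) filtering.

```python
import math

def temperature_top_p(logits, temperature=0.9, top_p=0.7):
    """Return {token_index: probability} after temperature scaling
    and nucleus (top-p) filtering. Illustrative only."""
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the most likely tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize over the kept tokens so they sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Hypothetical logits for a four-token vocabulary.
toy_logits = [2.0, 1.0, 0.5, -1.0]
filtered = temperature_top_p(toy_logits)
print(filtered)
```

Lowering top_p shrinks the set of candidate tokens, which tends to make output more conservative; raising the temperature spreads probability toward less likely tokens.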

Original model card

A bilingual instruction-tuned LoRA model of https://huggingface.co/baichuan-inc/baichuan-7B

Please follow the baichuan-7B License to use this model.

Usage:

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("hiyouga/baichuan-7b-sft", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("hiyouga/baichuan-7b-sft", trust_remote_code=True).cuda()
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

query = "晚上睡不着怎么办"
template = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
    "Human: {}\nAssistant: "
)

inputs = tokenizer([template.format(query)], return_tensors="pt")
inputs = inputs.to("cuda")
generate_ids = model.generate(**inputs, max_new_tokens=256, streamer=streamer)
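
The prompt template in the snippet above can be factored into a small helper so the same formatting is reused across calls. This is a minimal sketch: the names `TEMPLATE` and `build_prompt` are hypothetical and not part of transformers or of this model's code, and stripping whitespace from the query is an added assumption.

```python
# Same template text as in the usage example above.
TEMPLATE = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
    "Human: {}\nAssistant: "
)

def build_prompt(query: str) -> str:
    """Wrap a single user query in the chat template (hypothetical helper)."""
    # Stripping surrounding whitespace is an assumption, not part of the original usage.
    return TEMPLATE.format(query.strip())

prompt = build_prompt("晚上睡不着怎么办")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of `template.format(query)`. Whether the GGML build benefits from the same template is an assumption worth checking, since its inference example passes the raw prompt.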

Alternatively, you can launch a CLI demo using the script in https://github.com/hiyouga/LLaMA-Efficient-Tuning:

python src/cli_demo.py --template default --model_name_or_path hiyouga/baichuan-7b-sft

You can reproduce our results with the following script using LLaMA-Efficient-Tuning:

CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --model_name_or_path baichuan-inc/baichuan-7B \
    --do_train \
    --dataset alpaca_gpt4_en,alpaca_gpt4_zh,codealpaca \
    --template default \
    --finetuning_type lora \
    --lora_rank 16 \
    --lora_target W_pack,o_proj,gate_proj,down_proj,up_proj \
    --output_dir baichuan_lora \
    --overwrite_cache \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --gradient_accumulation_steps 8 \
    --preprocessing_num_workers 16 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 100 \
    --eval_steps 100 \
    --learning_rate 5e-5 \
    --max_grad_norm 0.5 \
    --num_train_epochs 2.0 \
    --dev_ratio 0.01 \
    --evaluation_strategy steps \
    --load_best_model_at_end \
    --plot_loss \
    --fp16
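
To make the batch and learning-rate settings in the command above concrete, the sketch below computes the effective batch size and a cosine-decayed learning rate. It assumes a single GPU (as implied by CUDA_VISIBLE_DEVICES=0) and a plain cosine decay to zero with no warmup, since the command sets no warmup flag. The function `cosine_lr` and the `total_steps` value are hypothetical and only illustrate the shape of the schedule; they are not the trainer's own code.

```python
import math

# Values copied from the training command above.
per_device_train_batch_size = 8
gradient_accumulation_steps = 8
num_devices = 1  # assumption: CUDA_VISIBLE_DEVICES=0 exposes one GPU
base_lr = 5e-5

# Number of samples that contribute to each optimizer step.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

def cosine_lr(step: int, total_steps: int, peak_lr: float = base_lr) -> float:
    """Cosine decay from peak_lr to 0 without warmup (illustrative assumption)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print("effective batch size:", effective_batch_size)

# total_steps is a hypothetical placeholder; the real value depends on dataset size.
total_steps = 1000
for step in (0, total_steps // 2, total_steps):
    print(step, cosine_lr(step, total_steps))
```

With gradient accumulation, each optimizer update sees 8 × 8 = 64 samples even though only 8 fit on the device per forward pass.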

Loss curve on training set: (figure: train)

Loss curve on evaluation set: (figure: eval)
