---
license: mit
datasets:
- m-a-p/COIG-CQIA
language:
- zh
- en
metrics:
- accuracy
pipeline_tag: text2text-generation
tags:
- finance
- legal
- medical
- code
- biology
---

# Model Summary

Llama3-8B-COIG-CQIA is an instruction-tuned language model for Chinese and English users, with abilities such as roleplaying and tool use, built upon the Meta-Llama-3-8B-Instruct model.

- Developed by: [Wenfeng Qiu](https://github.com/summit4you)
- License: [Llama-3 License](https://llama.meta.com/llama3/license/)
- Base Model: Meta-Llama-3-8B-Instruct
- Model Size: 8.03B
- Context length: 8K

# 1. Introduction

Training framework: [unsloth](https://github.com/unslothai/unsloth).

Training details:

- epochs: 1
- learning rate: 2e-4
- learning rate scheduler type: linear
- warmup steps: 5
- cutoff len (i.e. context length): 2048
- global batch size: 2
- fine-tuning type: full parameters
- optimizer: adamw_8bit

A sketch of how these hyperparameters map onto a training script is given at the end of this card.

# 2. Usage

For inference, use `llama.cpp` or a UI-based system such as `GPT4All`. You can install GPT4All by going [here](https://gpt4all.io/index.html).

Here is an example with `llama.cpp`, via the `llama-cpp-python` bindings (`pip install llama-cpp-python`):

```python
from llama_cpp import Llama

model = Llama(
    "/Your/Path/To/Llama3-8B-COIG-CQIA.Q8_0.gguf",
    verbose=False,
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=8192,  # match the model's 8K context length (the library default is much smaller)
)

system_prompt = "You are a helpful assistant."


def generate_response(_model, _messages, _max_tokens=8192):
    _output = _model.create_chat_completion(
        _messages,
        stop=["<|eot_id|>", "<|end_of_text|>"],
        max_tokens=_max_tokens,
    )["choices"][0]["message"]["content"]
    return _output


# The following is an example
messages = [
    {
        "role": "system",
        "content": system_prompt,
    },
    {"role": "user", "content": "你是谁?"},  # "Who are you?"
]

print(generate_response(_model=model, _messages=messages), end="\n\n\n")
```
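
Since `create_chat_completion` is stateless, a multi-turn conversation is just a growing message list: store the assistant's reply, append the next user turn, and call the helper again. A minimal sketch continuing the example above (the follow-up question is illustrative, not from the original card):

```python
# Keep the conversation going: record the reply, then ask a follow-up.
reply = generate_response(_model=model, _messages=messages)
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "What kinds of tasks can you help with?"})

print(generate_response(_model=model, _messages=messages))
```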
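
For reference, the hyperparameters in Section 1 map onto unsloth's documented `SFTTrainer` recipe roughly as follows. This is a minimal sketch, not the actual training script: the COIG-CQIA subset (`ruozhiba`), the `instruction`/`input`/`output` field names, and the prompt formatting are assumptions, it uses the TRL argument style shown in unsloth's examples, and unsloth's usual LoRA/PEFT step is omitted since the card states full-parameter fine-tuning.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

max_seq_length = 2048  # cutoff len from Section 1

# Load the base model through unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    max_seq_length=max_seq_length,
)

# COIG-CQIA ships as multiple subsets; "ruozhiba" is an illustrative
# choice, not necessarily the one used for this model.
dataset = load_dataset("m-a-p/COIG-CQIA", "ruozhiba", split="train")


def to_text(example):
    # Assumed field names and formatting; adapt to the actual schema.
    prompt = example["instruction"]
    if example.get("input"):
        prompt += "\n" + example["input"]
    return {"text": prompt + "\n" + example["output"]}


dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        num_train_epochs=1,
        learning_rate=2e-4,
        lr_scheduler_type="linear",
        warmup_steps=5,
        per_device_train_batch_size=2,  # global batch size 2 on a single GPU
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()
```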