
Model Card for Llama 3 8B Instruct (Quantized to 4-bit)

This model is Llama 3 8B Instruct fine-tuned on the Chinese datasets YeungNLP/firefly-train-1.1M and LooksJuicy/ruozhiba, then quantized to 4-bit (GGUF).

Model Details

Model Description

  • Developed by: Zane
  • Model type: Llama 3 8B Instruct (Quantized to 4-bit)
  • Language(s) (NLP): Chinese (zh)
  • License: Apache-2.0

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Zane666/ruozhi-Llamma3-8b-unsloth-q4"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" places the model on GPU when one is available
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

input_text = "请输入您的中文文本"  # "Please enter your Chinese text"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# max_new_tokens bounds only the generated continuation;
# max_length would also count the prompt tokens
outputs = model.generate(inputs.input_ids, max_new_tokens=50)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
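Since this is an Instruct model, prompts should follow the Llama 3 chat format. The reliable way is `tokenizer.apply_chat_template`, but the sketch below illustrates the special-token layout that template produces; `build_llama3_prompt` and its default system message are hypothetical helpers, not part of this repository.

```python
# Hypothetical sketch of a single-turn Llama 3 Instruct prompt.
# In practice, prefer tokenizer.apply_chat_template(messages, ...);
# this just shows the special-token layout for clarity.

def build_llama3_prompt(user_message: str,
                        system_message: str = "You are a helpful Chinese-language assistant.") -> str:
    """Assemble a single-turn Llama 3 Instruct prompt string."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_message}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("请输入您的中文文本")
print(prompt)
```

The trailing assistant header leaves the prompt open for the model to generate its reply; generation stops at the `<|eot_id|>` token.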
Technical Specifications

  • Model size: 8.03B params
  • Architecture: llama
  • Quantization: 4-bit (GGUF)

Datasets used to train Zane666/ruozhi-Llamma3-8b-unsloth-q4

  • YeungNLP/firefly-train-1.1M
  • LooksJuicy/ruozhiba