
Model Card for llama2_7b_taiwan_invoice_qlora

This PEFT adapter is fine-tuned on Taiwanese McDonald's invoice data and is intended for generating insights from that data.

Disclaimer: This model is an experiment in applying an LLM to a time-series problem. It is not investment advice, and its predictions must not be used as a basis for investment decisions.

Model Details

Model Description

This repo contains a QLoRA adapter (PEFT weights) for Meta's Llama 2 7B-chat; the usage example below loads it on top of the base model DavidLanz/Llama2-tw-7B-v2.0.1-chat.

Uses

import torch
from peft import LoraConfig, PeftModel

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TextStreamer,
    pipeline,
)

device_map = {"": 0}                # place the entire model on GPU 0
use_4bit = True                     # load the base weights in 4-bit
bnb_4bit_compute_dtype = "float16"  # dtype used for compute on dequantized weights
bnb_4bit_quant_type = "nf4"         # NormalFloat4 quantization
use_nested_quant = False            # no double (nested) quantization
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)

base_model_path = "DavidLanz/Llama2-tw-7B-v2.0.1-chat"
adapter_path = "DavidLanz/llama2_7b_taiwan_invoice_qlora"

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    low_cpu_mem_usage=True,
    return_dict=True,
    quantization_config=bnb_config,
    torch_dtype=torch.float16,
    device_map=device_map,
)
model = PeftModel.from_pretrained(base_model, adapter_path)
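# Optional (sketch): PEFT's merge_and_unload() can fold the adapter into the
# base weights; merging into a 4-bit bitsandbytes base requires a recent PEFT
# release, so load the base in fp16 without quantization_config if this errors.
# model = model.merge_and_unload()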

tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # Llama 2 has no pad token; reuse EOS
tokenizer.padding_side = "right"

# The model is already quantized and placed on the GPU above, so the pipeline
# needs no torch_dtype or device_map arguments here.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
messages = [
    {
        "role": "system",
        # English: "You are an expert at analyzing invoices from McDonald's
        # fast-food restaurants. You have detailed invoice data for February
        # 2024, including consumer IDs, purchased items, total amounts,
        # customer demographics such as gender and age, and purchase dates
        # and times. Based on this invoice data, provide insights or predict
        # trends in customer behavior."
        "content": "你是分析麥當勞速食店發票的專家。你已有 2024 年 2 月份的詳細發票資料,其中包含消費者 ID、消費者購買的商品項目、總金額、顧客的性別與年齡等人口統計資料,以及購買日期及時間。根據這些發票數據,請提供見解或預測顧客行為的趨勢。",
    },
    # English: "What was McDonald's best-selling item in Zhongzheng District,
    # Taipei, in February 2024?"
    {"role": "user", "content": "2024年2月麥當勞在台北市中正區最熱賣的商品是什麼?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])  # includes the prompt followed by the generated answer
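TextStreamer is imported above but unused in the example; as a minimal sketch, it can print tokens to stdout as they are generated instead of waiting for the full output:

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
_ = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, streamer=streamer)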

Training procedure

The following bitsandbytes quantization config was used during training (an illustrative LoRA config sketch follows the list):

  • quant_method: bitsandbytes
  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float16
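The list above covers only the quantization side of QLoRA. The LoRA side would be configured with LoraConfig (imported in the snippet above); the values below are illustrative assumptions, not the settings used to train this adapter:

lora_config = LoraConfig(
    r=64,                                 # assumed rank
    lora_alpha=16,                        # assumed scaling factor
    lora_dropout=0.1,                     # assumed dropout
    target_modules=["q_proj", "v_proj"],  # assumed target modules
    bias="none",
    task_type="CAUSAL_LM",
)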

Framework versions

  • PEFT 0.10
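Only the PEFT version is stated by this card; a plausible environment setup (the other package choices are assumptions) would be:

pip install peft==0.10.0 transformers accelerate bitsandbytes torch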