---
license: apache-2.0
library_name: peft
---

# Food Order Understanding in Korean

This is a LoRA adapter produced by fine-tuning the pre-trained model `meta-llama/Llama-2-7b-chat-hf`. It is intended to parse Korean food-order sentences and extract menu item names, option names, and quantities.

## Usage

The example below loads the adapter on top of the 4-bit quantized base model.
Note that the base model is `meta-llama/Llama-2-7b-chat-hf`.

```python
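import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

peft_model_id = "jangmin/qlora-llama2-7b-chat-hf-food-order-understanding-30K"
# Assumption: None uses the default Hugging Face cache; set a path if needed.
cache_dir = None

# The adapter config records which base model it was trained on.
config = PeftConfig.from_pretrained(peft_model_id)

# 4-bit NF4 quantization with double quantization, matching the training setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
# Load the quantized base model on GPU 0, then attach the LoRA adapter.
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, quantization_config=bnb_config, cache_dir=cache_dir, device_map={"":0})
model = PeftModel.from_pretrained(model, peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path, cache_dir=cache_dir)

model.eval()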
```

Inference can be done as follows.
```python
# The labels follow the training data format ("### 명령:" / "### 응답:").
instruction_prompt_template = """
다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: {0} ### 응답:
"""
def gen(x):
    q = instruction_prompt_template.format(x)
    gened = model.generate(
        **tokenizer(
            q, 
            return_tensors='pt', 
            return_token_type_ids=False
        ).to('cuda'), 
        max_new_tokens=256,
        early_stopping=True,
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id
    )
    decoded_results = tokenizer.batch_decode(gened, skip_special_tokens=True)
    return decoded_results[0]
```

A generated sample is as follows.
```python
print(gen("아이스아메리카노 톨사이즈 한잔 하고요. 딸기스무디 한잔 주세요. 또, 콜드브루라떼 하나요."))
```
```
다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: 아이스아메리카노 톨사이즈 한잔 하고요. 딸기스무디 한잔 주세요. 또, 콜드브루라떼 하나요. ### 응답:
 - 분석 결과 0: 음식명:아이스아메리카노, 옵션:톨사이즈, 수량:한잔
- 분석 결과 1: 음식명:딸기스무디, 수량:한잔
- 분석 결과 2: 음식명:콜드브루라떼, 수량:하나
```

More examples are as follows.
```
다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: 오늘은 비가오니깐 이거 먹자. 삼선짬뽕 곱배기 하나하구요, 사천 탕수육 중짜 한그릇 주세요. ### 응답:
 - 분석 결과 0: 음식명:삼선짬뽕,옵션:곱배기,수량:하나
- 분석 결과 1: 음식명:사천 탕수육,옵션:중짜,수량:한그릇

다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: 참이슬 한병, 코카콜라 1.5리터 한병, 테슬라 한병이요. ### 응답:
 - 분석 결과 0: 음식명:참이슬, 수량:한병
- 분석 결과 1: 음식명:코카콜라, 옵션:1.5리터, 수량:한병
- 분석 결과 2: 음식명:테슬라, 수량:한병

다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: 꼬막무침 1인분하고요, 닭도리탕 중자 주세요. 그리고 소주도 한병 주세요. ### 응답:
 - 분석 결과 0: 음식명:꼬막무침,수량:1인분
- 분석 결과 1: 음식명:닭도리탕,옵션:중자
- 분석 결과 2: 음식명:소주,수량:한병

다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: 김치찌개 3인분하고요, 계란말이 주세요. ### 응답:
 - 분석 결과 0: 음식명:김치찌개,수량:3인분
- 분석 결과 1: 음식명:계란말이

다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: 불고기버거세트 1개하고요 감자튀김 추가해주세요. ### 응답:
 - 분석 결과 0: 음식명:불고기버거, 수량:1개
- 분석 결과 1: 음식명:감자튀김

다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: 불닭볶음면 1개랑 사리곰탕면 2개 주세요. ### 응답:
 - 분석 결과 0: 음식명:불닭볶음면, 수량:1개
- 분석 결과 1: 음식명:사리곰탕면, 수량:2개

다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: 카페라떼 아이스 샷추가 한잔하구요. 스콘 하나 주세요 ### 응답:
 - 분석 결과 0: 음식명:카페라떼,옵션:아이스,샷추가,수량:한잔
- 분석 결과 1: 음식명:스콘,수량:하나

다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: 여기요 춘천닭갈비 4인분하고요. 라면사리 추가하겠습니다. 콜라 300ml 두캔주세요. ### 응답:
 - 분석 결과 0: 음식명:춘천닭갈비, 수량:4인분
- 분석 결과 1: 음식명:라면사리
- 분석 결과 2: 음식명:콜라, 옵션:300ml, 수량:두캔

다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.
분석 결과를 완성해주기 바란다.

### 명령: 있잖아요 조랭이떡국 3인분하고요. 떡만두 한세트 주세요. ### 응답:
 - 분석 결과 0: 음식명:조랭이떡국,수량:3인분
- 분석 결과 1: 음식명:떡만두,수량:한세트
```
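
The outputs above put one order item per line, with comma-separated `음식명` (menu item), `옵션` (option), and `수량` (quantity) fields. The following is a minimal sketch of turning such text into a list of dictionaries. The function name `parse_order_analysis` is hypothetical and not part of this repository. The sketch assumes the output keeps the format shown in the samples, and it treats a comma as a field separator only when a known field label follows it, so that multi-valued options such as `아이스,샷추가` stay together.

```python
import re

# Field labels as they appear in the sample outputs above.
FIELD_KEYS = ("음식명", "옵션", "수량")

# Matches result lines such as "- 분석 결과 0: 음식명:스콘,수량:하나".
LINE_RE = re.compile(r"^\s*-\s*분석 결과\s*(\d+)\s*:\s*(.*)$")

# Split only on commas that are directly followed by a known field label.
FIELD_SPLIT_RE = re.compile(r",\s*(?=(?:%s):)" % "|".join(FIELD_KEYS))


def parse_order_analysis(text):
    """Hypothetical helper: return one dict per '- 분석 결과 N:' line."""
    items = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            # Prompt lines and blank lines are ignored.
            continue
        record = {}
        for field in FIELD_SPLIT_RE.split(match.group(2).strip()):
            key, _, value = field.partition(":")
            if key in FIELD_KEYS:
                record[key] = value.strip()
        items.append(record)
    return items


sample = (
    "### 명령: 카페라떼 아이스 샷추가 한잔하구요. 스콘 하나 주세요 ### 응답:\n"
    " - 분석 결과 0: 음식명:카페라떼,옵션:아이스,샷추가,수량:한잔\n"
    "- 분석 결과 1: 음식명:스콘,수량:하나\n"
)
parsed = parse_order_analysis(sample)
print(parsed)
```

Because the text returned by `gen` includes the prompt, the parser skips every line that does not start with `- 분석 결과`. Fields the model omits (for example a missing quantity) are simply absent from the corresponding dictionary.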

## Training

Fine-tuning was performed with https://github.com/artidoro/qlora on a machine with an RTX 4090 and took approximately 9 hours.
`max_steps` was set to 5,000, which amounts to nearly two passes over the entire dataset.
The training script is shown below.
```bash
python qlora.py \
    --cache_dir /Jupyter/huggingface/.cache \
    --model_name_or_path meta-llama/Llama-2-7b-chat-hf \
    --use_auth \
    --output_dir ../output/llama2-gpt4-30k-food-order-understanding-7b \
    --logging_steps 10 \
    --save_strategy steps \
    --data_seed 42 \
    --save_steps 500 \
    --save_total_limit 40 \
    --evaluation_strategy steps \
    --eval_dataset_size 1024 \
    --max_eval_samples 1000 \
    --per_device_eval_batch_size 12 \
    --max_new_tokens 32 \
    --dataloader_num_workers 1 \
    --group_by_length \
    --logging_strategy steps \
    --remove_unused_columns False \
    --do_train \
    --do_eval \
    --lora_r 64 \
    --lora_alpha 16 \
    --lora_modules all \
    --double_quant \
    --quant_type nf4 \
    --bf16 \
    --bits 4 \
    --warmup_ratio 0.03 \
    --lr_scheduler_type constant \
    --gradient_checkpointing \
    --dataset /Jupyter/dev_src/ASR-for-noisy-edge-devices/data/food-order-understanding-gpt4-30k.json \
    --target_max_len 512 \
    --per_device_train_batch_size 12 \
    --gradient_accumulation_steps 1 \
    --max_steps 5000 \
    --eval_steps 500 \
    --learning_rate 0.0002 \
    --adam_beta2 0.999 \
    --max_grad_norm 0.3 \
    --lora_dropout 0.1 \
    --weight_decay 0.0 \
    --seed 0 \
    --report_to tensorboard
```
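
As a rough check of the "two passes" figure, the number of samples consumed during training can be worked out from the flags in the script above. This sketch assumes a single GPU and a dataset of about 30,000 examples. Both values are assumptions made for illustration, and the exact training split is slightly smaller if evaluation samples are held out of it.

```python
# Values taken from the training script above.
max_steps = 5000
per_device_train_batch_size = 12
gradient_accumulation_steps = 1

# Assumptions: one GPU, and roughly 30K training examples.
num_devices = 1
dataset_size = 30_000

samples_seen = (
    max_steps
    * per_device_train_batch_size
    * gradient_accumulation_steps
    * num_devices
)
approx_epochs = samples_seen / dataset_size

print(f"samples seen: {samples_seen}")
print(f"approximate passes over the dataset: {approx_epochs:.2f}")
```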

## Dataset

The dataset was constructed through the GPT API with `gpt-4`. A prompt template was designed to generate pairs of a food-order sentence and its analysis. In total, 30K examples were generated.
Note that generating the 30K examples took 3,000 API calls and cost about $400.
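
For a sense of scale, the figures above imply the following per-call and per-example costs. This is simple arithmetic on the approximate numbers quoted in this section, not a measured breakdown.

```python
# Approximate figures quoted above.
total_cost_usd = 400
num_examples = 30_000
num_api_calls = 3_000

examples_per_call = num_examples / num_api_calls
cost_per_call = total_cost_usd / num_api_calls
cost_per_example = total_cost_usd / num_examples

print(f"{examples_per_call:.0f} examples per call")
print(f"${cost_per_call:.3f} per call, ${cost_per_example:.4f} per example")
```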

Some generated examples are as follows:

```json
[
  {
    "input": "다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.\n분석 결과를 완성해주기 바란다.\n\n### 명령: 제육볶음 한그릇하고요, 비빔밥 한그릇 추가해주세요. ### 응답:\n",
    "output": "- 분석 결과 0: 음식명:제육볶음,수량:한그릇\n- 분석 결과 1: 음식명:비빔밥,수량:한그릇"
  },
  {
    "input": "다음은 매장에서 고객이 음식을 주문하는 주문 문장이다. 이를 분석하여 음식명, 옵션명, 수량을 추출하여 고객의 의도를 이해하고자 한다.\n분석 결과를 완성해주기 바란다.\n\n### 명령: 사천탕수육 곱배기 주문하고요, 샤워크림치킨도 하나 추가해주세요. ### 응답:\n",
    "output": "- 분석 결과 0: 음식명:사천탕수육,옵션:곱배기\n- 분석 결과 1: 음식명:샤워크림치킨,수량:하나"
  }
]
```

## Note

I have also released another fine-tuned language model, `jangmin/qlora-polyglot-ko-12.8b-food-order-understanding-32K`, which is based on `EleutherAI/polyglot-ko-12.8b`. Its dataset was generated using `gpt-3.5-turbo-16k`. I believe that a dataset generated by `GPT-4` is of higher quality than one generated by `GPT-3.5`.