Llama3-Chat_Vector-kor_llava

I implemented this Korean LLaVA model with reference to two existing models: the Korean Chat Vector LLaVA model by Beomi and the Japanese Chat Vector LLaVA model by Toshi456.

Reference Models:

  1. beomi/Llama-3-KoEn-8B-xtuner-llava-preview (https://huggingface.co/beomi/Llama-3-KoEn-8B-xtuner-llava-preview)
  2. toshi456/chat-vector-llava-v1.5-7b-ja (https://huggingface.co/toshi456/chat-vector-llava-v1.5-7b-ja)
  3. xtuner/llava-llama-3-8b-transformers
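
This card does not spell out the merging procedure, so the following is only a minimal sketch of the general Chat Vector idea the reference models are named after: take the weight difference between an instruction-tuned model and its base model, then add it to another model that shares the same architecture. The function name, the `skip` parameter, and the toy parameter names are hypothetical, and plain floats stand in for weight tensors. The same arithmetic works on PyTorch state dicts, since it only needs `+` and `-`.

```python
def apply_chat_vector(base, chat, target, skip=lambda name: False):
    """Return target + (chat - base) for every shared parameter name.

    base, chat, target are hypothetical name -> weight mappings
    (for example PyTorch state dicts). Values only need to support + and -.
    """
    merged = {}
    for name, target_weight in target.items():
        if skip(name) or name not in base or name not in chat:
            # Leave the target weight untouched, e.g. for embeddings whose
            # vocabulary differs between the models (an assumption here).
            merged[name] = target_weight
            continue
        merged[name] = target_weight + (chat[name] - base[name])
    return merged


# Toy example: scalars stand in for weight tensors.
base = {"layer.weight": 1.0, "embed.weight": 0.5}
chat = {"layer.weight": 1.5, "embed.weight": 0.75}
target = {"layer.weight": 2.0, "embed.weight": 4.0}

merged = apply_chat_vector(
    base, chat, target, skip=lambda name: name.startswith("embed")
)
print(merged)
```

Whether embeddings or other layers should be skipped depends on how the vocabularies of the models line up; the predicate above is just one way to express that choice.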

Citation

@misc {Llama3-Chat_Vector-kor_llava,
    author       = { {nebchi} },
    title        = { Llama3-Chat_Vector-kor_llava },
    year         = 2024,
    url          = { https://huggingface.co/nebchi/Llama3-Chat_Vector-kor_llava },
    publisher    = { Hugging Face }
}

[Image: Seoul City]

Running the model on GPU

import requests
from PIL import Image

import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration, TextStreamer

model_id = "nebchi/Llama3-Chat_Vector-kor_llava"

model = LlavaForConditionalGeneration.from_pretrained(
    model_id, 
    torch_dtype='auto', 
    device_map='auto',
    revision='a38aac3', 
)

processor = AutoProcessor.from_pretrained(model_id)

tokenizer = processor.tokenizer
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
streamer = TextStreamer(tokenizer)

# Korean prompt: "Please describe this image." The assistant turn is pre-filled with "In this image,".
prompt = ("<|start_header_id|>user<|end_header_id|>\n\n<image>\n이 이미지에 대해서 설명해주세요.<|eot_id|>"
          "<|start_header_id|>assistant<|end_header_id|>\n\n이 이미지에는")
image_file = "https://search.pstatic.net/common/?src=http%3A%2F%2Fimgnews.naver.net%2Fimage%2F5582%2F2018%2F04%2F20%2F0000001323_001_20180420094641826.jpg&type=sc960_832"

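raw_image = Image.open(requests.get(image_file, stream=True).raw)
# Pass text and images by keyword, and match the model's device and dtype:
# the checkpoint is stored in BF16, so forcing float16 inputs can cause a dtype mismatch.
inputs = processor(text=prompt, images=raw_image, return_tensors='pt').to(model.device, model.dtype)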

output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,  
    eos_token_id=terminators,
    no_repeat_ngram_size=3, 
    temperature=0.7,  
    top_p=0.9,  
    streamer=streamer
)
print(processor.decode(output[0][2:], skip_special_tokens=False))
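
The prompt string above follows the Llama 3 chat layout with an `<image>` placeholder in the user turn. As a small convenience sketch, the helper below rebuilds that same string from a question and an optional assistant prefix. The function name `build_llava_llama3_prompt` and its parameters are hypothetical and not part of the model or of transformers.

```python
def build_llava_llama3_prompt(question, assistant_prefix=""):
    """Format one user turn (image + question) in the Llama 3 chat layout.

    assistant_prefix optionally pre-fills the start of the model's answer,
    as the example above does.
    """
    return (
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"<image>\n{question}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
        f"{assistant_prefix}"
    )


# Same question and prefix as in the example above.
prompt = build_llava_llama3_prompt(
    "이 이미지에 대해서 설명해주세요.",
    assistant_prefix="이 이미지에는",
)
print(prompt)
```

Pre-filling the assistant turn nudges the model to continue in Korean from that phrase; leaving `assistant_prefix` empty lets it start the answer on its own.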

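Results

이 이미지에는 도시의 모습이 잘 보여집니다. 도시 내부에는 여러 건물과 건물들이 있고, 도시를 연결하는 도로와 교통 시스템이 잘 발달되어 있습니다. 이 도시의 특징은 높고 광범위한 건물들과 교통망을 갖춘 것이 좋습니다.

(English translation: This image gives a good view of a city. Inside the city there are many buildings, and the roads and transport systems that connect the city are well developed. What characterizes this city is that it has tall, extensive buildings and a good transport network.)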

Model size: 8.36B params (Safetensors), tensor type BF16