---
language:
- ko
- en
license: cc-by-nc-sa-4.0
library_name: transformers
---
# Llama-3-KoEn-8B-xtuner-llava-preview 🌋
<!-- Provide a quick summary of what the model is/does. -->
Llama-3-KoEn-8B-xtuner-llava-preview 🌋 is a Korean multimodal model based on the LLaVA architecture, created by merging two models with the [ChatVector](https://arxiv.org/abs/2310.04799) method:
1) [beomi/Llama-3-KoEn-8B-preview](https://huggingface.co/beomi/Llama-3-KoEn-8B-preview)
2) [xtuner/llava-llama-3-8b-transformers](https://huggingface.co/xtuner/llava-llama-3-8b-transformers)
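
The ChatVector idea can be sketched as plain weight arithmetic: take the difference between a tuned model and the base it was trained from, then add that difference to another model that shares the same base. The sketch below is illustrative only. Which checkpoint played which role for this model is not stated here, so the names `base`, `tuned`, `target`, and `skip_prefixes` are assumptions, and Python lists stand in for real tensors.

```python
from typing import Dict, List, Sequence

# Toy stand-in for a state dict: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def apply_chat_vector(
    base: Weights,
    tuned: Weights,
    target: Weights,
    skip_prefixes: Sequence[str] = (),
) -> Weights:
    """Return target + (tuned - base) for parameters present in all three models.

    Parameters whose names start with one of `skip_prefixes` (a hypothetical
    option, e.g. for embedding tables) are copied from `target` unchanged.
    """
    merged: Weights = {}
    for name, target_values in target.items():
        skip = any(name.startswith(prefix) for prefix in skip_prefixes)
        if skip or name not in base or name not in tuned:
            merged[name] = list(target_values)
            continue
        merged[name] = [
            t + (c - b)
            for t, c, b in zip(target_values, tuned[name], base[name])
        ]
    return merged


# Made-up numbers, only to show the arithmetic.
base = {"layer.weight": [1.0, 2.0], "embed.weight": [0.0, 0.0]}
tuned = {"layer.weight": [1.5, 1.0], "embed.weight": [9.0, 9.0]}
target = {"layer.weight": [3.0, 3.0], "embed.weight": [0.25, 0.75]}

merged = apply_chat_vector(base, tuned, target, skip_prefixes=("embed.",))
print(merged)
```

With real checkpoints the same loop would run over `state_dict()` tensors instead of lists.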
## Model Details
### Model Description
- **Developed by:** Junbum Lee (Beomi)
- **Model type:** HuggingFace Llava 🌋
- **Language(s) (NLP):** Korean, English
- **License:** cc-by-nc-sa-4.0, under the Llama 3 License
- **Merged from model:** [beomi/Llama-3-KoEn-8B-preview](https://huggingface.co/beomi/Llama-3-KoEn-8B-preview) & [xtuner/llava-llama-3-8b-transformers](https://huggingface.co/xtuner/llava-llama-3-8b-transformers)
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
![Cat walking on frozen Han-River, Seoul](https://cdn-uploads.huggingface.co/production/uploads/5e56829137cb5b49818287ea/NWfoArWI4UPAxpEnolkwT.jpeg)
> Two versions are recommended: v1 (Basic ChatVector, three revisions) and v2 (model-diff-based merging).
>
> v1. `revision='a38aac3'`: Basic ChatVector, with [25B+ trained KoEn checkpoint (rev. d4d25a2)](https://huggingface.co/beomi/Llama-3-KoEn-8B-preview/commit/d4d25a2).
>
> v1-1. `revision='0224971'`: Basic ChatVector, with [40B+ trained KoEn checkpoint (rev. ad39b32)](https://huggingface.co/beomi/Llama-3-KoEn-8B-preview/commit/ad39b32cd4207f37f61f16e79d3f4020c5b744ef).
>
> v1-2. `revision='170746c'`: Basic ChatVector, with [80B+ trained KoEn checkpoint (rev. b4c45ab)](https://huggingface.co/beomi/Llama-3-KoEn-8B-preview/commit/b4c45ab3355c6ccb9bb1ecdf8a75ded4d6620c7e).
>
> v2. `revision='4f04d1e'`: Model-diff-based merging (ref. https://huggingface.co/blog/maywell/llm-feature-transfer), with [25B+ trained KoEn checkpoint (rev. d4d25a2)](https://huggingface.co/beomi/Llama-3-KoEn-8B-preview/commit/d4d25a2).
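
To switch between the versions listed above without copying hashes by hand, a small lookup table can help. This is a convenience sketch: the hashes are copied from the list, while `REVISIONS` and `resolve_revision` are hypothetical names introduced for illustration.

```python
# Version label -> revision hash, copied from the list above.
REVISIONS = {
    "v1": "a38aac3",    # Basic ChatVector, 25B+ trained KoEn checkpoint
    "v1-1": "0224971",  # Basic ChatVector, 40B+ trained KoEn checkpoint
    "v1-2": "170746c",  # Basic ChatVector, 80B+ trained KoEn checkpoint
    "v2": "4f04d1e",    # Model-diff-based merging, 25B+ trained KoEn checkpoint
}


def resolve_revision(version: str) -> str:
    """Return the revision hash for a version label, or raise ValueError."""
    try:
        return REVISIONS[version]
    except KeyError:
        choices = ", ".join(sorted(REVISIONS))
        raise ValueError(f"unknown version {version!r}; choose from: {choices}") from None


revision = resolve_revision("v1-2")
print(revision)
```

The returned string can then be passed as `revision=revision` to `from_pretrained`.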
```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "beomi/Llama-3-KoEn-8B-xtuner-llava-preview"

model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype='auto',
    device_map='auto',
    revision='a38aac3',  # 'a38aac3' for basic ChatVector, '4f04d1e' for model-diff-based merging (ref. https://huggingface.co/blog/maywell/llm-feature-transfer)
)
processor = AutoProcessor.from_pretrained(model_id)

tokenizer = processor.tokenizer
# Stop at either the tokenizer's EOS token or the Llama 3 end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

# Llama 3 chat format with an <image> placeholder.
# The Korean question means "Please describe this image."; the assistant turn
# is pre-filled with "이 이미지에는" ("In this image,") as the start of the answer.
prompt = ("<|start_header_id|>user<|end_header_id|>\n\n<image>\n이 이미지에 대해서 설명해주세요.<|eot_id|>"
          "<|start_header_id|>assistant<|end_header_id|>\n\n이 이미지에는")
image_file = "https://cdn-uploads.huggingface.co/production/uploads/5e56829137cb5b49818287ea/NWfoArWI4UPAxpEnolkwT.jpeg"

raw_image = Image.open(requests.get(image_file, stream=True).raw)
# Keyword arguments avoid depending on the processor's positional argument order.
inputs = processor(text=prompt, images=raw_image, return_tensors='pt').to(model.device, torch.float16)

output = model.generate(**inputs, max_new_tokens=400, do_sample=True, eos_token_id=terminators)
# Drop the first two tokens so the printout starts at `user`, as in the example outputs.
print(processor.decode(output[0][2:], skip_special_tokens=False))
```

```text
# --- Example Output [v1, Chat Vector] ---
user<|end_header_id|>
<image>
이 이미지에 대해서 설명해주세요.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
이 이미지에는 고양이 한 마리가 강물 위를 걸어가는 모습이 보여집니다. 고양이는 강물의 잔물결에 미끄럼을 타고 강 가로를 지나는 데 능숙하게 보입니다. 고양이의 발은 강물로 잘 들어가, 그것을 즐기며 걸어갑니다.
또한 이 이미지도 음성 녹음을 하거나 녹화된 자료로 제작되었으며, 주로 고양이의 모습을 강하게 보여줍니다. 소리 효과도 여러 가지로 추가하여 고양이의 스토리를 다양하게 전달합니다. 강물은 잔물결을 나타내며 강물 위를 걷는 고양이의 모습을 더욱 강렬하게 강조하기 위해 잔물결을 통해 더 디테일한 장면을 보여줍니다.<|eot_id|>
# English translation: This image shows a cat walking across river water. The cat appears skilled at sliding on the river's ripples and crossing the riverside. The cat's paws go well into the water, and it walks along enjoying it. In addition, this image was produced from an audio recording or recorded footage and mainly gives a strong view of the cat. Various sound effects are added to convey the cat's story in different ways. The river shows ripples, and to emphasize the cat walking on the water even more intensely, the ripples show a more detailed scene.
# --- Example Output [v1-1, Chat Vector] ---
user<|end_header_id|>
<image>
이 이미지에 대해서 설명해주세요.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
이 이미지에서는 한 고양이가 서해안에 위치한 바다를 걷고 있는 모습을 볼 수 있습니다. 고양이는 해변에서부터 바다로 걸어들어가는 중이며, 주변에는 잔잔한 파도가 밀려오는 모습을 보여주고 있습니다. 이 고양이는 태어날 때부터 고양이와 강아지와는 다르게 바다를 경험하고, 적응해가고 있습니다. 고양이는 바다를 좋아하고, 이 환경에서 행복을 느끼는 것 같습니다. 이 고양이는 인간이 아닌 자연의 일부로써 이 환경에서 살아가고 있습니다.<|eot_id|>
# English translation: In this image, a cat can be seen walking in the sea off the west coast. The cat is walking from the beach into the sea, and gentle waves are rolling in around it. Unlike other cats and dogs, this cat has experienced the sea since birth and is adapting to it. The cat seems to like the sea and to feel happy in this environment. This cat lives in this environment as a part of nature, not of humans.
# --- Example Output [v1-2, Chat Vector] ---
# model.generate(**inputs, max_new_tokens=200, do_sample=True, top_p=0.7, eos_token_id=terminators,)
user<|end_header_id|>
<image>
이 이미지에 대해서 설명해주세요.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
이 이미지는 한 고양이가 물 위를 걷고 있는 모습을 포착한 사진입니다. 고양이는 두 발로 물 위를 걸어 가고 있습니다. 고양이는 4개의 발 중 2개의 발은 물에 빠지지 않고 2개의 발은 물에 빠져 있습니다. 고양이의 발이 빠진 부분은 반영되어 물에 비쳐 있습니다. 물 위를 걷는 고양이의 모습이 참으로 귀엽고 사랑스럽습니다. 이 사진은 KBS 동물의 왕국에서 방영되었습니다. KBS 동물의 왕국은 1985년부터 시작하여 2019년까지 34년 동안 방영된 KBS의 대표적인 자연 다큐멘터리 프로그램입니다. KBS 동물의 왕국은 동물의 생태와 습성, 행동, 그리고 자연 환경을 이해하고 보호하는 데 기여하고자 합니다.
# English translation: This image is a photo capturing a cat walking on water. The cat is walking on the water on two feet. Of its four paws, two are out of the water and two are in it. The part where the paws are submerged is reflected in the water. The cat walking on the water looks truly cute and lovely. This photo was aired on KBS Animal Kingdom. KBS Animal Kingdom is KBS's flagship nature documentary program, which ran for 34 years from 1985 to 2019. KBS Animal Kingdom aims to contribute to understanding and protecting animal ecology, habits, behavior, and the natural environment.
# --- Example Output [v2, Model diff based merging] ---
user<|end_header_id|>
<image>
이 이미지에 대해서 설명해주세요.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
이 이미지에는 한국어 자막과 함께 고양이가 물에 발을 디디고 걷는 모습이 담겨 있습니다. 고양이는 오른쪽 발을 물에 담그고 걷는 중이며, 한국어 자막은 "고양이는 물을 좋아합니다"라는 문장을 포함하고 있습니다. 이 자막은 고양이가 물을 좋아하는 것을 강조하고 있습니다.<|eot_id|>
# English translation: This image shows a cat stepping into water and walking, along with Korean subtitles. The cat is walking with its right paw dipped in the water, and the Korean subtitle contains the sentence "Cats like water." The subtitle emphasizes that the cat likes water.
```
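
The prompt in the example above follows the Llama 3 chat layout with an `<image>` placeholder and an optional pre-filled start of the assistant's answer. A minimal sketch of helpers that build such a prompt and pull the assistant's reply out of the decoded text might look like this. `build_prompt` and `extract_reply` are hypothetical names, not part of `transformers`.

```python
USER_HEADER = "<|start_header_id|>user<|end_header_id|>\n\n"
ASSISTANT_MARKER = "<|start_header_id|>assistant<|end_header_id|>"
EOT = "<|eot_id|>"


def build_prompt(question: str, assistant_prefix: str = "") -> str:
    """Build a single-turn prompt in the same layout as the example above."""
    return (
        f"{USER_HEADER}<image>\n{question}{EOT}"
        f"{ASSISTANT_MARKER}\n\n{assistant_prefix}"
    )


def extract_reply(decoded: str) -> str:
    """Return the text after the last assistant header, up to the end-of-turn token."""
    # rpartition leaves the whole string in the last slot if the marker is absent.
    _, _, reply = decoded.rpartition(ASSISTANT_MARKER)
    return reply.split(EOT, 1)[0].strip()


prompt = build_prompt("Describe this image.", assistant_prefix="This image shows")
# Stand-in for text decoded with skip_special_tokens=False.
decoded = prompt + " a cat walking on a frozen river.<|eot_id|>"
print(extract_reply(decoded))
```

Pre-filling the assistant turn, as the original example does with a Korean phrase, is optional. Leaving `assistant_prefix` empty lets the model start its answer freely.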