internvl3_5-1b-balanced
An InternVL3.5 specialist model, fine-tuned from OpenGVLab/InternVL3_5-1B-Pretrained on Korean multimodal data.
| Item | Value |
|---|---|
| Base model | OpenGVLab/InternVL3_5-1B-Pretrained |
| Method | Full fine-tuning |
| Domain | Korean:English 1:1 balance |
Hyperparameters
- num_train_epochs: 1
- steps: 5162 / max 5162
- train_batch_size: 8
- peak learning_rate: 9.999984252764371e-05
Training Loss
- init 1.4746 → final 0.9904 (min 0.9242)
| step | loss |
|---|---|
| 10 | 1.4746 |
| 520 | 1.0770 |
| 1030 | 1.0217 |
| 1540 | 1.0139 |
| 2050 | 0.9842 |
| 2560 | 0.9938 |
| 3070 | 0.9712 |
| 3580 | 0.9627 |
| 4090 | 0.9475 |
| 4600 | 0.9431 |
| 5110 | 0.9660 |
| 5160 | 0.9904 |
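
The figures in the table can be summarized with a few lines of Python. This is only a sketch over the twelve rows listed above; `LOSS_LOG` and `summarize` are names introduced here for illustration and are not part of the training code.

```python
# (step, loss) pairs copied from the table above.
LOSS_LOG = [
    (10, 1.4746), (520, 1.0770), (1030, 1.0217), (1540, 1.0139),
    (2050, 0.9842), (2560, 0.9938), (3070, 0.9712), (3580, 0.9627),
    (4090, 0.9475), (4600, 0.9431), (5110, 0.9660), (5160, 0.9904),
]


def summarize(log):
    """Return the relative loss reduction and the lowest listed loss."""
    first_loss = log[0][1]
    last_loss = log[-1][1]
    best_step, best_loss = min(log, key=lambda row: row[1])
    return {
        "relative_drop": (first_loss - last_loss) / first_loss,
        "best_step": best_step,
        "best_loss": best_loss,
    }


summary = summarize(LOSS_LOG)
print(f"relative drop: {summary['relative_drop']:.1%}")
print(f"lowest listed loss: {summary['best_loss']} at step {summary['best_step']}")
```

The lowest value among the listed rows is 0.9431 at step 4600, which is higher than the reported minimum of 0.9242. The minimum was therefore presumably logged at a step the table does not include.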
Training Data
Composition: 40 subsets (Korean specialist SFT)
| subset | repeat |
|---|---|
| hf_korLlava_Caption | 2.7 |
| kr_recap_caption | 1.2857 |
| chart2table_en | 0.0475 |
| chartRqa1 | 0.3167 |
| chartRqa2 | 0.475 |
| dvqa_en | 0.0475 |
| en_chartqa_chart | 0.5278 |
| en_figureqa_chart | 0.475 |
| en_mapqa_chart | 1.3571 |
| kisti_documen_Reason | 2.25 |
| en_imgtext_doc | 0.5625 |
| en_vwi_doc | 0.9 |
| out_kor_llava | 1.9501 |
| en_allava_general | 3.0 |
| aihub_mathMultiple | 0.8449 |
| aihub_mathSubjective | 1.7883 |
| en_geoqa_math | 1.0 |
| en_hme_formula | 1.6667 |
| en_iconqa_math | 0.6667 |
| en_mavis_math | 0.3333 |
| hf_latexUpdate | 0.6667 |
| aihub_subjectTxt_OCR | 0.625 |
| aihub_visual_OCR | 0.8333 |
| kisti_arxiv_OCR | 0.8333 |
| kopub_SDSKoPub_OCR | 0.3077 |
| en_chrome_ocr | 1.4205 |
| en_iam_ocr | 2.2093 |
| en_llavar_ocr | 0.6316 |
| en_ocrvqa_ocr | 0.25 |
| aihub_visual_ShortQA | 0.7167 |
| kisti_hanbat_Reason | 0.7167 |
| kisti_hanbat_Vqa | 0.86 |
| en_aokvqa_general | 1.3438 |
| aihub_subjectImg_Parse | 1.35 |
| en_tabmwp_table | 0.6136 |
| en_tatqa_table_text | 1.0385 |
| tableVqa_Caption | 0.675 |
| tableVqa_Reason | 0.675 |
| ko_alpaca_textonly | 0.9006 |
| en_evol_textonly | 0.3152 |
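
The card does not say how the repeat values are applied. One plausible reading, assumed here, is that each value is a multiplier on the subset size: values above 1 oversample a subset and values below 1 subsample it. The sketch below illustrates that reading only; `apply_repeat` is a hypothetical helper and the 1000-sample subset size is made up.

```python
import math
import random


def apply_repeat(samples, repeat, seed=0):
    """Resample a subset to roughly len(samples) * repeat items.

    Assumption: the repeat value is a size multiplier. Whole multiples
    become full copies; the fractional part is a random draw without
    replacement from the subset.
    """
    samples = list(samples)
    n = len(samples)
    if n == 0:
        return []
    # The small epsilon guards against float error such as 26.999999999999996.
    target = math.floor(n * repeat + 1e-9)
    full_copies, remainder = divmod(target, n)
    rng = random.Random(seed)
    return samples * full_copies + rng.sample(samples, remainder)


# Hypothetical subset of 1000 samples, with three repeat values from the table.
subset = list(range(1000))
for name, repeat in [
    ("en_allava_general", 3.0),
    ("hf_korLlava_Caption", 2.7),
    ("chart2table_en", 0.0475),
]:
    print(name, len(apply_repeat(subset, repeat)))
```

Under this reading, the mix of repeat values is what would bring the Korean and English subsets to the 1:1 balance stated above, although the actual subset sizes are not given in the card.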
Usage
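```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "yujuyeon/internvl3_5-1b-balanced"

# trust_remote_code=True is required because the repository ships custom model code.
model = AutoModel.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, trust_remote_code=True
).eval().cuda()  # bfloat16 weights, inference mode, moved to the GPU
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True, use_fast=False)
```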
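
The snippet above only loads the model and tokenizer. A text-only query might then look like the sketch below. It assumes that this repository's remote code exposes the same `chat(tokenizer, pixel_values, question, generation_config)` method as the upstream InternVL checkpoints, which this card does not confirm. `generation_settings` and `ask` are hypothetical helper names, and image inputs are left out because they need a preprocessing step not described here.

```python
MODEL_ID = "yujuyeon/internvl3_5-1b-balanced"


def generation_settings(max_new_tokens=128):
    """Greedy decoding settings, passed to chat() as a plain dict."""
    return {"max_new_tokens": max_new_tokens, "do_sample": False}


def ask(question, max_new_tokens=128):
    """Load the model and answer one text-only question (requires a CUDA GPU)."""
    # Imported lazily so generation_settings() works without torch installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    model = AutoModel.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, trust_remote_code=True
    ).eval().cuda()
    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_ID, trust_remote_code=True, use_fast=False
    )
    # Assumption: chat() comes from InternVL's remote code; passing
    # pixel_values=None requests a text-only turn.
    return model.chat(tokenizer, None, question, generation_settings(max_new_tokens))
```

Calling `ask("대한민국의 수도는 어디인가요?")` would return the reply as a string if the assumption about `chat` holds. For repeated queries, load the model once outside the function rather than on every call.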