Llama-3.2-11B X-Ray Analysis Model (v1)

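This model is a fine-tuned version of unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit, designed for analyzing medical radiology images (X-ray, CT scans, ultrasound). It was trained on a subset of the ROCO radiology dataset (unsloth/Radiology_mini) to generate image descriptions in the role of an expert radiographer.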

Training

The model was fine-tuned with Unsloth, which is advertised as giving roughly 2x faster training with reduced memory use. Key training details:

  • Base model: unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit
  • Dataset: unsloth/Radiology_mini
  • LoRA adapters: parameter-efficient fine-tuning was used, so only a small fraction of the model's parameters were trained.
    • r = 16
    • lora_alpha = 16
    • lora_dropout = 0
  • Training hyperparameters:
    • learning_rate = 2e-4
    • per_device_train_batch_size = 2
    • gradient_accumulation_steps = 4
    • max_steps = 30 (or num_train_epochs = 1 for a full run)
    • optimizer = adamw_8bit
  • Fine-tuning configuration:
    • finetune_vision_layers = False
    • finetune_language_layers = True
    • finetune_attention_modules = True
    • finetune_mlp_modules = True
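
To make the numbers above more concrete, here is a minimal sketch of the arithmetic they imply. The LoRA and batch values are copied from the list; the device count and the 4096 x 4096 projection size are assumptions chosen for illustration and are not stated in this model card. The scaling formula is the standard LoRA one (alpha / r), which assumes rank-stabilized LoRA was not enabled.

```python
# Values copied from the training details listed above.
LORA_R = 16
LORA_ALPHA = 16
PER_DEVICE_TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
MAX_STEPS = 30

# Assumptions for illustration only (not stated in the model card).
NUM_DEVICES = 1            # assumed single-GPU run
HYPOTHETICAL_D_IN = 4096   # hypothetical input width of one projection matrix
HYPOTHETICAL_D_OUT = 4096  # hypothetical output width of the same matrix


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Scale applied to the low-rank update in standard LoRA: alpha / r."""
    return lora_alpha / r


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * accumulation_steps * num_devices


def lora_params_for_matrix(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


batch = effective_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES)
samples_seen = batch * MAX_STEPS
added = lora_params_for_matrix(HYPOTHETICAL_D_IN, HYPOTHETICAL_D_OUT, LORA_R)
frozen = HYPOTHETICAL_D_IN * HYPOTHETICAL_D_OUT

print("LoRA scaling:", lora_scaling(LORA_R, LORA_ALPHA))          # 1.0
print("Effective batch size:", batch)                            # 8
print("Samples seen in 30 steps:", samples_seen)                 # 240
print("Adapter share of one 4096x4096 matrix:", added / frozen)  # 0.0078125
```

Under these assumptions, the 30-step run sees 240 training samples, and each adapted matrix of the assumed size gains under 1% additional trainable parameters.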

Usage

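The model takes an image and a text prompt as input and generates a text description. The expected input format is a conversation, as shown below: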

from unsloth import FastVisionModel
import torch
from transformers import TextStreamer

# Load the model (to load from a local copy, replace "YLX1965/llama3-2-11b-xray-v1" with your model path)
model, tokenizer = FastVisionModel.from_pretrained(
    "YLX1965/llama3-2-11b-xray-v1",
    load_in_4bit = True,  # set to False to load in 16-bit
)
FastVisionModel.for_inference(model)

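# Example image and instruction (replace with your own image path)
# from datasets import load_dataset
# dataset = load_dataset("unsloth/Radiology_mini", split="train")
# image = dataset[0]["image"]

from PIL import Image
# The processor expects a PIL image, not a path string, so open the file first
image = Image.open("path/to/your/image.jpg").convert("RGB")  # example path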

instruction = "You are an expert radiologist. Describe accurately what you see in this image."

messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens = False,
    return_tensors = "pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer, skip_prompt = True)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128,
                  use_cache = True, temperature = 1.5, min_p = 0.1)
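
The generate call above passes temperature = 1.5 and min_p = 0.1. As a rough illustration of how these two settings interact, the following pure-Python sketch applies temperature scaling and then min-p filtering to a toy distribution. It assumes that sampling is enabled and that temperature is applied before the min-p filter; the function name and the toy logits are hypothetical and are not part of Unsloth or Transformers.

```python
import math


def min_p_filter(logits, temperature=1.5, min_p=0.1):
    """Return {token_index: probability} after temperature scaling and min-p filtering."""
    # Temperature scaling: values above 1 flatten the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Min-p: drop tokens whose probability is below min_p times the top probability.
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}
    # Renormalize the surviving tokens so they sum to 1.
    norm = sum(kept.values())
    return {i: p / norm for i, p in kept.items()}


# Toy five-token distribution (hypothetical values).
toy_logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.04, 0.01)]

print(sorted(min_p_filter(toy_logits, temperature=1.0)))  # [0, 1, 2]
print(sorted(min_p_filter(toy_logits, temperature=1.5)))  # [0, 1, 2, 3]
```

In this toy case the higher temperature lets one more low-probability token survive the filter, while the least likely token is still removed, which is the usual motivation for pairing a high temperature with min_p.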