IDEFICS3_ROCO / README.md

æLtorio

update tl;dr

5cbfb5c unverified 4 months ago

5.21 kB

	---
	license: apache-2.0
	datasets:
	- eltorio/ROCO-radiology
	language:
	- en
	- fr
	base_model:
	- HuggingFaceM4/Idefics3-8B-Llama3
	pipeline_tag: image-text-to-text
	library_name: peft
	---

	# IDEFICS3_ROCO

	![Stage](https://img.shields.io/badge/stage-early%20development-yellow)![License](https://img.shields.io/badge/license-Apache%202.0-blue)![Contributors Welcome](https://img.shields.io/badge/contributors-welcome-brightgreen)[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb)

	## Star the project

	If you appreciate my work, please consider giving it a star! 🤩
	I'm also looking for donations of free GPU time to complete the fine-tuning process.
	Please contact me if you can help! 🙏

	## A Fine-tuned Radiology-focused Model based on Hugging Face's Idefics3 Model

	This repository contains a fine-tuned version of the Hugging Face [Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3) model, built on top of the Meta Llama 3.1 8B architecture. Our model, `IDEFICS3_ROCO`, has been fine-tuned on the [Radiology Objects in Context (ROCO)](https://huggingface.co/datasets/eltorio/ROCO-radiology) dataset, a large-scale medical and multimodal imaging collection.

	## TL;DR

	For immediate use, you can load the model directly from Hugging Face:

	```python
	import torch
	from transformers import AutoModelForImageTextToText
	device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
	model = AutoModelForImageTextToText.from_pretrained("eltorio/IDEFICS3_ROCO").to(device)
	```

	### Model Information

	* Base Model: Idefics3-8B-Llama3
	* Fine-tuning Dataset: Radiology Objects in Context (ROCO)
	* License: Apache-2.0
	* Current Status: Fine-tuning process is finished. Contributions to complete the fine-tuning / vallidation / test processes are welcome!

	### Training Progress Status

	* Current checkpoint: 12267 (100% completed)
	* Estimated remaining GPU time: 0 hours
	* Hardware requirements: T4 GPU with >16GB VRAM
	* Last update: november, 12th 2024

	### Fine-tuning Code

	The fine-tuning code is available as a Jupyter Notebook in the [ROCO-radiology dataset repository](https://huggingface.co/datasets/eltorio/ROCO-radiology) on Hugging Face:

	* [ROCO-idefics3.ipynb](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb)

	The [Junyper Notebook](https://colab.research.google.com/#fileId=https%3A//huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) contains the code to fine-tune the Idefics3-8B-Llama3 model on the ROCO dataset. The fine-tuning process is currently halted at checkpoint 640 (out of 24,000) due to limitations with Colab Free T4 GPU unit. Contributions to complete the fine-tuning process are welcome!

	### Contributions Welcome

	If you have the resources to complete the fine-tuning process, we would appreciate your contribution. Please fork this repository, finish the fine-tuning process, and submit a pull request with your updates.

	### Citation

	If you use this model in your work, please cite the original Idefics3 model and our fine-tuned model:

	* [Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3)
	* [IDEFICS3_ROCO](https://huggingface.co/eltorio/IDEFICS3_ROCO)

	### Contribution Guide

	1. Technical Requirements
	* Access to powerful GPU (T4, V100, A100 or equivalent)
	* Python environment with PyTorch
	* Disk space: ~100GB

	2. Getting Started
	* Fork the repository
	* Resume from checkpoint 12267
	* Follow instructions in [ROCO-idefics3.ipynb](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb)

	3. Contact
	* For questions: [link to issues/discussions](https://huggingface.co/eltorio/IDEFICS3_ROCO/discussions)

	### Docker Image

	A AI training docker image is available for this model. The image and includes all necessary dependencies to run the fine-tuning process.
	You need to set the `HF_TOKEN` environment variable to your Hugging Face API token.
	You also need to have NVidia Docker container runtime installed.
	Finnaly, you need to run the container with GPU support with `--gpus all` option.
	The image is available on Docker Hub:

	```bash
	export HF_TOKEN=hf_some_token
	docker run --gpus all --user=42420:42420 -e HF_TOKEN=$HF_TOKEN -it sctg/roco-idefics3:latest bash -i /start.sh $HF_TOKEN
	```

	The Dockerfile is available in the [IDEFICS_ROCO repository](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/Dockerfile).

	### Acknowledgments

	This work was made possible by the [Hugging Face Transformers](https://huggingface.co/) library and the [ROCO-radiology dataset](https://huggingface.co/datasets/eltorio/ROCO-radiology).