|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- eltorio/ROCO-radiology |
|
language: |
|
- en |
|
- fr |
|
base_model: |
|
- HuggingFaceM4/Idefics3-8B-Llama3 |
|
pipeline_tag: image-text-to-text |
|
library_name: peft |
|
--- |
|
|
|
# IDEFICS3_ROCO |
|
|
|
[](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) |
|
|
|
## Star the project |
|
|
|
**If you appreciate my work, please consider giving it a star! 🤩** |
|
**I'm also looking for donations of free GPU time to complete the fine-tuning process.** |
|
**Please contact me if you can help! 🙏** |
|
|
|
## A Fine-tuned Radiology-focused Model based on Hugging Face's Idefics3 Model |
|
|
|
This repository contains a fine-tuned version of the Hugging Face [Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3) model, built on top of the Meta Llama 3.1 8B architecture. Our model, `IDEFICS3_ROCO`, has been fine-tuned on the [Radiology Objects in Context (ROCO)](https://huggingface.co/datasets/eltorio/ROCO-radiology) dataset, a large-scale medical and multimodal imaging collection. |
|
|
|
## TL;DR |
|
|
|
For immediate use, you can load the model directly from Hugging Face: |
|
|
|
```python |
|
import torch |
|
from transformers import AutoModelForImageTextToText |
|
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu') |
|
model = AutoModelForImageTextToText.from_pretrained("eltorio/IDEFICS3_ROCO").to(device) |
|
``` |
|
|
|
### Model Information |
|
|
|
* **Base Model:** Idefics3-8B-Llama3 |
|
* **Fine-tuning Dataset:** Radiology Objects in Context (ROCO) |
|
* **License:** Apache-2.0 |
|
* **Current Status:** Fine-tuning process is finished. Contributions to complete the fine-tuning / vallidation / test processes are welcome! |
|
|
|
### Training Progress Status |
|
|
|
* Current checkpoint: 12267 (100% completed) |
|
* Estimated remaining GPU time: 0 hours |
|
* Hardware requirements: T4 GPU with >16GB VRAM |
|
* Last update: november, 12th 2024 |
|
|
|
### Fine-tuning Code |
|
|
|
The fine-tuning code is available as a Jupyter Notebook in the [ROCO-radiology dataset repository](https://huggingface.co/datasets/eltorio/ROCO-radiology) on Hugging Face: |
|
|
|
* [ROCO-idefics3.ipynb](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) |
|
|
|
The [Junyper Notebook](https://colab.research.google.com/#fileId=https%3A//huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) [](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) contains the code to fine-tune the Idefics3-8B-Llama3 model on the ROCO dataset. The fine-tuning process is currently halted at checkpoint 640 (out of 24,000) due to limitations with Colab Free T4 GPU unit. Contributions to complete the fine-tuning process are welcome! |
|
|
|
### Contributions Welcome |
|
|
|
If you have the resources to complete the fine-tuning process, we would appreciate your contribution. Please fork this repository, finish the fine-tuning process, and submit a pull request with your updates. |
|
|
|
### Citation |
|
|
|
If you use this model in your work, please cite the original Idefics3 model and our fine-tuned model: |
|
|
|
* [Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3) |
|
* [IDEFICS3_ROCO](https://huggingface.co/eltorio/IDEFICS3_ROCO) |
|
|
|
### Contribution Guide |
|
|
|
1. **Technical Requirements** |
|
* Access to powerful GPU (T4, V100, A100 or equivalent) |
|
* Python environment with PyTorch |
|
* Disk space: ~100GB |
|
|
|
2. **Getting Started** |
|
* Fork the repository |
|
* Resume from checkpoint 12267 |
|
* Follow instructions in [ROCO-idefics3.ipynb](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) [](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) |
|
|
|
3. **Contact** |
|
* For questions: [link to issues/discussions](https://huggingface.co/eltorio/IDEFICS3_ROCO/discussions) |
|
|
|
### Docker Image |
|
|
|
A AI training docker image is available for this model. The image and includes all necessary dependencies to run the fine-tuning process. |
|
You need to set the `HF_TOKEN` environment variable to your Hugging Face API token. |
|
You also need to have NVidia Docker container runtime installed. |
|
Finnaly, you need to run the container with GPU support with `--gpus all` option. |
|
The image is available on Docker Hub: |
|
|
|
```bash |
|
export HF_TOKEN=hf_some_token |
|
docker run --gpus all --user=42420:42420 -e HF_TOKEN=$HF_TOKEN -it sctg/roco-idefics3:latest bash -i /start.sh $HF_TOKEN |
|
``` |
|
|
|
The Dockerfile is available in the [IDEFICS_ROCO repository](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/Dockerfile). |
|
|
|
### Acknowledgments |
|
|
|
This work was made possible by the [Hugging Face Transformers](https://huggingface.co/) library and the [ROCO-radiology dataset](https://huggingface.co/datasets/eltorio/ROCO-radiology). |