tarekziade/deit-tiny-distilgpt2

Tags: vision-encoder-decoder, image-text-to-text, image-captioning

A variation of https://huggingface.co/tarekziade/distilvit

Trained on 270k images from Flickr10k and COCO. Training source code: https://github.com/tarekziade/distilvit
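
A minimal sketch of generating a caption with this checkpoint. It assumes the repository ships an image processor config and a tokenizer alongside the weights, and that `transformers`, `torch` and `Pillow` are installed. The `caption_image` helper and the `max_new_tokens` default are illustrative names and values, not part of the model card.

```python
MODEL_ID = "tarekziade/deit-tiny-distilgpt2"


def caption_image(image_path: str, max_new_tokens: int = 30) -> str:
    """Return a generated caption for the image at image_path (hypothetical helper)."""
    # Heavy dependencies are imported lazily so that importing this module stays cheap.
    from PIL import Image
    from transformers import (
        AutoImageProcessor,
        AutoTokenizer,
        VisionEncoderDecoderModel,
    )

    # Assumption: the repo provides processor and tokenizer files next to the weights.
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    processor = AutoImageProcessor.from_pretrained(MODEL_ID)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

    # The vision encoder expects RGB input, so convert grayscale or RGBA images.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # The decoder generates token ids, which are decoded back to text.
    output_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()
```

Calling `caption_image("photo.jpg")` downloads the weights on first use; in a long-running process the model, processor and tokenizer would normally be loaded once and reused.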

Results:

eval_loss: 0.2305169701576233
eval_rouge1: 39.511
eval_rouge2: 14.7798
eval_rougeL: 35.9476
eval_rougeLsum: 35.9497
eval_gen_len: 11.695219762592236
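
For orientation on the ROUGE numbers above, here is a simplified sketch of ROUGE-1 as a unigram-overlap F-score between a generated caption and a reference. It is an illustration only: it uses plain whitespace tokenization and no stemming, so it will not reproduce the reported values, which appear to be on a 0-100 scale. The function name `rouge1_f` and the example captions are hypothetical.

```python
from collections import Counter


def rouge1_f(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between two strings, in the range 0.0 to 1.0."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = rouge1_f("a dog runs on the beach", "a dog is running on the beach")
print(f"ROUGE-1 F1 (x100): {100 * score:.2f}")
```

In this example five of the six candidate tokens appear in the seven-token reference, giving precision 5/6 and recall 5/7.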

Model size: 102M params (Safetensors)
Tensor type: F32

Base model: distilbert/distilgpt2