
Checkpoints (OFA-CN)

We provide checkpoints of OFA-CN, the Chinese version of OFA, in Base and Large sizes. They include the pretrained models as well as models finetuned on image captioning and referring expression comprehension. For referring expression comprehension, we translated the texts in the RefCOCO(-/+/g) datasets and finetuned OFA-CN on the translated data. We plan to release the new datasets in the near future.

Checkpoints

Below we provide the links for downloading the Chinese OFA checkpoints.

Pretraining

Finetuning (OFA-Large)

Finetuning (OFA-Base)

Model Card

Below we provide basic architecture information for the Base-size and Large-size OFA-CN models.

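| Model | #Params | Backbone | Hidden Size | Intermediate Size | #Heads | #Enc. Layers | #Dec. Layers |
|---|---|---|---|---|---|---|---|
| OFA-Base | 160M | ResNet101 | 768 | 3072 | 12 | 6 | 6 |
| OFA-Large | 443M | ResNet152 | 1024 | 4096 | 16 | 12 | 12 |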
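
For readers who want the model card in machine-readable form, here is a minimal sketch that records the two rows as Python dataclasses. The class `OFACNConfig` and its field names are hypothetical and are not part of the OFA codebase. The derived per-head dimension assumes standard multi-head attention, where the hidden size is split evenly across heads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OFACNConfig:
    """Hypothetical container for one model-card row (not an OFA class)."""
    name: str
    params: str
    backbone: str
    hidden_size: int
    intermediate_size: int
    num_heads: int
    encoder_layers: int
    decoder_layers: int

    @property
    def head_dim(self) -> int:
        # Assumption: standard multi-head attention, so the hidden size
        # is divided evenly among the attention heads.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads


# Values copied from the model card above.
OFA_CN_BASE = OFACNConfig("OFA-Base", "160M", "ResNet101", 768, 3072, 12, 6, 6)
OFA_CN_LARGE = OFACNConfig("OFA-Large", "443M", "ResNet152", 1024, 4096, 16, 12, 12)

for cfg in (OFA_CN_BASE, OFA_CN_LARGE):
    ffn_ratio = cfg.intermediate_size // cfg.hidden_size
    print(f"{cfg.name}: head_dim={cfg.head_dim}, ffn_ratio={ffn_ratio}")
```

Under that assumption, both sizes work out to the same per-head dimension (64) and the same feed-forward expansion ratio (4), so the Large model differs from the Base model in width, depth, and backbone.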

Results

Below we provide the results of OFA-CN and the baselines for comparison.

MUGE Caption

| Model | BLEU@4 | ROUGE-L | CIDEr-D |
|---|---|---|---|
| Trm | 7.33 | 51.51 | 11.00 |
| M6 | 16.19 | 55.06 | 30.75 |
| OFA-Base | 26.23 | 58.95 | 50.70 |
| OFA-Large | 27.32 | 59.20 | 53.51 |

RefCOCO-CN Series

| Model | RefCOCO (val/testA/testB) | RefCOCO+ (val/testA/testB) | RefCOCOg (val/test-u) |
|---|---|---|---|
| OFA-Base (random-init) | 30.13/35.07/25.03 | 17.89/20.90/15.83 | 20.30/20.45 |
| OFA-Base | 82.18/86.07/76.68 | 69.38/77.26/60.14 | 73.57/72.53 |
| OFA-Large | 82.84/86.54/76.50 | 71.30/78.56/61.85 | 71.96/71.30 |
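
The slash-separated cells above pack several splits into one string. As a rough sketch, the snippet below parses the RefCOCO column and computes the per-split difference between OFA-Base and its random-init counterpart. The helper names `parse_scores` and `gain` are hypothetical, and reading "random-init" as "finetuned without the pretrained weights" is an assumption about the table label.

```python
from typing import Dict

# Scores copied from the RefCOCO column of the table above.
REFCOCO = {
    "OFA-Base (random-init)": "30.13/35.07/25.03",
    "OFA-Base": "82.18/86.07/76.68",
    "OFA-Large": "82.84/86.54/76.50",
}
SPLITS = ("val", "testA", "testB")


def parse_scores(cell: str) -> Dict[str, float]:
    """Split a 'val/testA/testB' cell into a split-name -> score mapping."""
    values = [float(v) for v in cell.split("/")]
    if len(values) != len(SPLITS):
        raise ValueError(f"expected {len(SPLITS)} scores, got {len(values)}")
    return dict(zip(SPLITS, values))


def gain(model: str, baseline: str) -> Dict[str, float]:
    """Per-split score difference (model minus baseline), rounded to 2 places."""
    a = parse_scores(REFCOCO[model])
    b = parse_scores(REFCOCO[baseline])
    return {split: round(a[split] - b[split], 2) for split in SPLITS}


pretraining_gain = gain("OFA-Base", "OFA-Base (random-init)")
size_gain = gain("OFA-Large", "OFA-Base")
print(pretraining_gain)
print(size_gain)
```

On these numbers the gap between OFA-Base and the random-init model is more than 50 points on every RefCOCO split, while moving from Base to Large changes each split by less than one point.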