
OFA (One For All) is an open-source, unified sequence-to-sequence learning framework developed by OFA-Sys.

The accompanying paper was published at ICML 2022 [1].

OFA is designed to unify modalities and tasks such as:

- image captioning
- visual question answering (VQA)
- visual grounding
- text-to-image generation
- text classification
- text generation
- image classification
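To illustrate what unifying tasks in a sequence-to-sequence framework can look like, the following minimal sketch maps several of the tasks above onto a single input structure (an instruction plus an optional image). The names `TaskSpec`, `Seq2SeqInput`, `TASKS`, and `build_input`, as well as the instruction wording, are hypothetical and chosen for illustration; they are not taken from the OFA codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskSpec:
    template: str      # instruction text given to the model
    needs_image: bool  # whether an input image accompanies the instruction
    output: str        # kind of sequence the model is expected to produce


@dataclass(frozen=True)
class Seq2SeqInput:
    task: str
    instruction: str
    image_path: Optional[str]
    expected_output: str


# Hypothetical instruction templates (illustrative wording only).
TASKS = {
    "image_captioning": TaskSpec("what does the image describe?", True, "text"),
    "vqa": TaskSpec("{text}", True, "text"),
    "visual_grounding": TaskSpec(
        'which region does the text "{text}" describe?', True, "region coordinates"
    ),
    "text_to_image_generation": TaskSpec(
        "what is the complete image? caption: {text}", False, "image tokens"
    ),
    "text_classification": TaskSpec(
        "what is the label of the text? text: {text}", False, "text"
    ),
}


def build_input(task: str, text: str = "", image_path: Optional[str] = None) -> Seq2SeqInput:
    """Turn any supported task into the same (instruction, image) input shape."""
    spec = TASKS[task]  # raises KeyError for an unknown task name
    if spec.needs_image and image_path is None:
        raise ValueError(f"task {task!r} requires an image")
    return Seq2SeqInput(
        task=task,
        instruction=spec.template.format(text=text),
        image_path=image_path if spec.needs_image else None,
        expected_output=spec.output,
    )


if __name__ == "__main__":
    print(build_input("visual_grounding", text="a red car", image_path="street.jpg"))
    print(build_input("text_to_image_generation", text="a cat on a sofa"))
```

Because every task reduces to the same input shape, a single encoder-decoder model can in principle serve all of them; only the instruction and the expected output sequence differ.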

It provides step-by-step instructions for pretraining and finetuning, with checkpoints available from both the official repository and Hugging Face [2].

OFA also provides online demos for trying its pretrained and finetuned models interactively [1].

Furthermore, it offers Colab notebooks that walk through the procedures [3]. The release of MuE (Multiple Exiting), a dynamic early-exiting method, accelerates OFA with little performance degradation [1].

The project also provides OFA-OCR for Chinese text recognition and MMSpeech, an ASR pre-training method based on OFA [1].

A Chinese version of OFA has also been released [1].

References:

[1] https://github.com/OFA-Sys/OFA

[2] https://github.com/OFA-Sys

[3] https://github.com/OFA-Sys/OFA/blob/main/colab.md