ziqingyang committed on
Commit
41e7a27
1 Parent(s): 337fe46

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -4,7 +4,9 @@ language:
  - en
  ---
 
- **VLE** (**V**isual-**L**anguage **E**ncoder) is an image-text multimodal understanding model built on the pre-trained text and image encoders. It can be used for multimodal discriminative tasks such as visual question answering and image-text retrieval. Especially on the visual commonsense reasoning (VCR) task, which requires high-level language understanding and reasoning skills, VLE achieves the best performance among the public methods.
+ **VLE** (**V**isual-**L**anguage **E**ncoder) is an image-text multimodal understanding model built on the pre-trained text and image encoders.
+ It can be used for multimodal discriminative tasks such as visual question answering and image-text retrieval.
+ Especially on the visual commonsense reasoning (VCR) task, which requires high-level language understanding and reasoning skills, VLE achieves significant improvements.
 
  For more details see [https://github.com/iflytek/VLE](https://github.com/iflytek/VLE).
 