---
tags:
- multimodal-entailment
- generic

---
## TensorFlow Keras Implementation of Multimodal Entailment

This repo contains the model from the Keras example [Multimodal Entailment](https://keras.io/examples/nlp/multimodal_entailment/#dataset-visualization).

Credits: [Sayak Paul](https://twitter.com/RisingSayak) - Original Author

HF Contribution: [Rishav Chandra Varma](https://huggingface.co/reichenbach)  

## Background Information

### Introduction

In this example, we will build and train a model for predicting multimodal entailment. We will be using the [multimodal entailment dataset](https://github.com/google-research-datasets/recognizing-multimodal-entailment) recently introduced by Google Research.

### What is multimodal entailment?

On social media platforms, to audit and moderate content, we may want to find answers to the following questions in near real-time:

- Does a given piece of information contradict the other?
- Does a given piece of information imply the other?

In NLP, this task is called textual entailment. However, that applies only when the information comes from text content. In practice, the available information often comes not just from text, but from a multimodal combination of text, images, audio, video, etc. Multimodal entailment is simply the extension of textual entailment to a variety of new input modalities.
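To make the idea concrete, here is a minimal sketch of how a multimodal entailment classifier can be wired up in Keras: each modality is encoded into a feature vector, the vectors are fused by concatenation, and a softmax head predicts the entailment label. The feature dimensions and the three-class label set below are assumptions for illustration; the actual example uses pretrained BERT and image encoders.

```python
import numpy as np
from tensorflow import keras

# Hypothetical feature sizes; real encoders (e.g. BERT, a CNN) would produce these.
TEXT_DIM, IMAGE_DIM = 128, 256
NUM_CLASSES = 3  # e.g. implies / contradicts / no-entailment (assumed label set)

# One input per modality, already encoded as dense feature vectors.
text_in = keras.Input(shape=(TEXT_DIM,), name="text_features")
image_in = keras.Input(shape=(IMAGE_DIM,), name="image_features")

# Fuse modalities by simple concatenation, then classify.
fused = keras.layers.Concatenate()([text_in, image_in])
hidden = keras.layers.Dense(64, activation="relu")(fused)
outputs = keras.layers.Dense(NUM_CLASSES, activation="softmax")(hidden)

model = keras.Model(inputs=[text_in, image_in], outputs=outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Sanity-check the wiring with a dummy batch of two examples.
preds = model.predict(
    [np.zeros((2, TEXT_DIM)), np.zeros((2, IMAGE_DIM))], verbose=0
)
print(preds.shape)  # one probability distribution per example
```

The full example swaps the dummy feature inputs for learned text and image encoders, but the fuse-then-classify structure stays the same.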