Diangle committed on
Commit d26fa24
1 Parent(s): 721dbfe

Update README.md

Files changed (1):
  1. README.md +9 -7
README.md CHANGED
@@ -20,13 +20,6 @@ For training purposes, a subset consisting of the first 150,000 video-text pairs
 
 This HF model is based on the [clip-vit-base-patch32](https://huggingface.co/openai/clip-vit-base-patch32) architecture, with weights trained by Daphna Idelson at [Searchium](https://www.searchium.ai).
 
-## Motivation
-
-As per the original authors, the main motivation behind this work is to leverage the power of the CLIP image-language pre-training model and apply it to learning
-visual-temporal concepts from videos, thereby improving video-based searches.
-
-By using the WebVid dataset, the model's capabilities were enhanced even beyond those described in the paper, thanks to the large-scale and diverse nature of the dataset empowering the model's performance.
-
 
 # How to use
 ### Extracting Text Embeddings:
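The text-embedding snippet itself falls outside this diff's hunks (only its closing `print` line survives as context in the next hunk). Purely as a hedged illustration, not the README's verbatim code, here is a minimal sketch of extracting a text embedding, assuming the checkpoint loads with the stock `transformers` CLIP text classes:

```python
# Sketch under assumptions: the README's actual snippet is not shown in this diff.
import torch
from transformers import CLIPTokenizer, CLIPTextModelWithProjection

model = CLIPTextModelWithProjection.from_pretrained("Diangle/clip4clip-webvid")
tokenizer = CLIPTokenizer.from_pretrained("Diangle/clip4clip-webvid")

inputs = tokenizer(text="a basketball player performing a slam dunk",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(input_ids=inputs["input_ids"],
                    attention_mask=inputs["attention_mask"])

# L2-normalize the projected embedding so cosine similarity reduces to a dot product.
sequence_output = outputs[0] / outputs[0].norm(dim=-1, keepdim=True)
print("sequence_output: ", sequence_output)
```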
@@ -55,6 +48,7 @@ print("sequence_output: ", sequence_output)
 
 An additional [notebook](https://huggingface.co/Diangle/clip4clip-webvid/blob/main/Notebooks/GSI_VideoRetrieval_VideoEmbedding.ipynb) is available that provides instructions on how to perform video embedding.
 
+
 ## Model Intended Use
 
 This model is intended for use in large scale video-text retrieval applications.
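The linked notebook is the authoritative reference for video embedding; as a hedged sketch only of the CLIP4Clip-style approach (uniformly sampled frames embedded individually, then mean-pooled), and assuming the vision tower also loads with stock `transformers` classes:

```python
# Sketch under assumptions: frame sampling and exact pooling follow the notebook.
import torch
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection

processor = CLIPImageProcessor.from_pretrained("Diangle/clip4clip-webvid")
vision_model = CLIPVisionModelWithProjection.from_pretrained("Diangle/clip4clip-webvid")

def embed_video(frames):
    """frames: list of PIL.Image frames sampled uniformly from one video."""
    inputs = processor(images=frames, return_tensors="pt")
    with torch.no_grad():
        frame_embeds = vision_model(**inputs).image_embeds   # (num_frames, dim)
    frame_embeds = frame_embeds / frame_embeds.norm(dim=-1, keepdim=True)
    video_embed = frame_embeds.mean(dim=0)                   # mean-pool over frames
    return video_embed / video_embed.norm(dim=-1)            # renormalize for retrieval
```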
@@ -62,6 +56,13 @@ This model is intended for use in large scale video-text retrieval applications.
 To illustrate its functionality, refer to the accompanying [**Video Search Space**](https://huggingface.co/spaces/Diangle/Clip4Clip-webvid) which provides a search demonstration on a vast collection of approximately 1.5 million videos.
 This interactive demo showcases the model's capability to effectively retrieve videos based on text queries, highlighting its potential for handling substantial video datasets.
 
+## Motivation
+
+As per the original authors, the main motivation behind this work is to leverage the power of the CLIP image-language pre-training model and apply it to learning
+visual-temporal concepts from videos, thereby improving video-based searches.
+
+By using the WebVid dataset, the model's capabilities were enhanced even beyond those described in the paper, thanks to the large-scale and diverse nature of the dataset empowering the model's performance.
+
 
 ## Evaluations
 
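The demo's retrieval step is not spelled out in this diff; under the assumption that query and video embeddings are pre-computed and L2-normalized (as in the sketches above), ranking is just a dot product. A hypothetical helper, not from the README:

```python
# Hypothetical helper: rank normalized video embeddings against one
# normalized text query embedding by cosine similarity.
import numpy as np

def top_k(text_embed: np.ndarray, video_embeds: np.ndarray, k: int = 5):
    scores = video_embeds @ text_embed      # cosine similarity via dot product
    order = np.argsort(-scores)[:k]         # indices of the k best-matching videos
    return order, scores[order]
```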
@@ -93,6 +94,7 @@ For an elaborate description of the evaluation refer to the notebook
 
 ## Acknowledgements
 Acknowledging Diana Mazenko of [Searchium](https://www.searchium.ai) for adapting and loading the model to Hugging Face, and for creating a Hugging Face [**SPACE**](https://huggingface.co/spaces/Diangle/Clip4Clip-webvid) for a large-scale video-search demo.
+
 Acknowledgments also to Luo et al. for their comprehensive work on CLIP4Clip and openly available code.
 
 ## Citations
 