mozilla-foundation/youtube_video_similarity_model_wt

aapot committed on Sep 20, 2022

Commit e8619a2 • 1 Parent(s): 45b39ff

Update README

Files changed (1)

README.md +2 -0

@@ -29,6 +29,8 @@ For pretrained cross-encoders, [mmarco-mMiniLMv2-L12-H384-v1](https://huggingfac
 **Note**: Some YouTube videos lack transcripts, so two versions of this model were trained: one with transcripts (WT = with transcripts) and one without transcripts (NT = no transcripts). This is the model with transcripts; the model without transcripts is available [here](https://huggingface.co/mozilla-foundation/youtube_video_similarity_model_nt).
+**Note**: Possible model architecture enhancements are discussed briefly in [this blog post](https://foundation.mozilla.org/en/blog/the-regretsreporter-user-controls-study-machine-learning-to-measure-semantic-similarity-of-youtube-videos/). Some of these ideas were implemented and tried in an experimental v2 version of the model, whose code is available in the RegretsReporter [GitHub repository](https://github.com/mozilla-extensions/regrets-reporter/tree/main/analysis/semsim). On the test set, the experimental v2 model did not significantly improve the results, so the weights of the more complex v2 model are not being released at this time.
 ## Intended uses & limitations
 This model is intended for analyzing whether a pair of YouTube videos is similar. We hope that this model will prove valuable to other researchers investigating YouTube.