Why does ViViT in Hugging Face have no factorized encoder, factorized self-attention, etc.?

#4
by tsaganshosg - opened

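Dear friends,
I found that the PyTorch implementation here has no "factorized encoder", "factorized self-attention", or the other factorized variants.
The patchifying implemented here is just the simple "joint space-time" scheme, where all spatio-temporal tokens go through one encoder together.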
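To make the difference concrete, here is a rough sketch of how I understand the token counts and per-layer attention cost of the variants. The clip size (32 frames at 224x224) and the 2x16x16 tubelet are assumed values for illustration, the function names are my own, and class tokens are ignored, so treat the numbers as approximate rather than as a description of this checkpoint.

```python
# Rough per-layer attention cost (number of query-key pairs) for three
# ViViT-style designs. All sizes below are assumed for illustration only.

def tubelet_grid(num_frames, image_size, tubelet):
    """Number of tubelets along time, height and width."""
    t, h, w = tubelet
    return num_frames // t, image_size // h, image_size // w

def joint_space_time_pairs(n_t, n_h, n_w):
    """Every token attends to every other token in the clip."""
    n = n_t * n_h * n_w
    return n * n

def factorized_encoder_pairs(n_t, n_h, n_w):
    """Spatial encoder per temporal index, then a temporal encoder
    over one summary token per temporal index."""
    spatial = n_t * (n_h * n_w) ** 2
    temporal = n_t ** 2
    return spatial + temporal

def factorized_self_attention_pairs(n_t, n_h, n_w):
    """Each block attends over space first, then over time."""
    spatial = n_t * (n_h * n_w) ** 2
    temporal = (n_h * n_w) * n_t ** 2
    return spatial + temporal

n_t, n_h, n_w = tubelet_grid(num_frames=32, image_size=224, tubelet=(2, 16, 16))
joint = joint_space_time_pairs(n_t, n_h, n_w)
fact_enc = factorized_encoder_pairs(n_t, n_h, n_w)
fact_sa = factorized_self_attention_pairs(n_t, n_h, n_w)

print("tubelet grid:", (n_t, n_h, n_w))
print("joint space-time pairs:", joint)
print("factorized encoder pairs:", fact_enc)
print("factorized self-attention pairs:", fact_sa)
```

Under these assumptions the joint scheme needs many more attention pairs per layer than either factorized design, which is why I expected the factorized variants to be available as well.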
Is there a reason for this? Thank you!
