Why does the ViViT implementation in Hugging Face have no factorized encoder or the other model variants?
#4, opened by tsaganshosg
Dear friends,
I found that the PyTorch implementation here does not include the "factorized encoder", "factorized self-attention", ... variants.
The patchifying implemented here is just the simple "Joint Space-Time" approach.
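To make the difference concrete, here is a rough sketch of my own (not code from the library). It only counts tokens and query-key pairs per attention layer; it ignores the CLS token, and the values 32 frames, 224x224 resolution, and a 2x16x16 tubelet are assumptions chosen for illustration. The function names are hypothetical.

```python
def tubelet_grid(num_frames, image_size, tubelet_size):
    """Number of tubelets along time, height and width (non-overlapping)."""
    t, h, w = tubelet_size
    return num_frames // t, image_size // h, image_size // w


def joint_attention_pairs(n_t, n_h, n_w):
    """Joint space-time: every token attends to every other token."""
    n_tokens = n_t * n_h * n_w
    return n_tokens * n_tokens


def factorized_encoder_pairs(n_t, n_h, n_w):
    """Factorized encoder: spatial attention per time index, then temporal
    attention over one summary token per time index."""
    spatial = n_t * (n_h * n_w) ** 2  # n_t independent spatial sequences
    temporal = n_t ** 2               # one sequence of n_t summary tokens
    return spatial + temporal


# Assumed example values, not read from any actual model config.
n_t, n_h, n_w = tubelet_grid(num_frames=32, image_size=224, tubelet_size=(2, 16, 16))
joint = joint_attention_pairs(n_t, n_h, n_w)
factorized = factorized_encoder_pairs(n_t, n_h, n_w)

print("tubelet grid:", (n_t, n_h, n_w))
print("joint space-time pairs per layer:", joint)
print("factorized encoder pairs per layer:", factorized)
```

With these assumed values the factorized encoder needs roughly 16 times fewer query-key pairs per layer than joint space-time attention, which is why I expected to find that variant here.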
Thank you!