different number of layers in the openclip text encoder

#20
by kamwoh - opened

I noticed that the text encoder in SD2.1 has a different number of layers from the text encoder in laion/CLIP-ViT-H-14-laion2B-s32B-b79K.

In SD2.1 the text encoder has 23 layers, while in laion/CLIP-ViT-H-14-laion2B-s32B-b79K it has 24. Just wondering whether this has any impact on generation quality or text understanding.
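
For anyone who wants to check the counts themselves, here is a minimal sketch of one way to count transformer blocks from checkpoint key names. It assumes the keys follow the usual `text_model.encoder.layers.<i>.` (transformers-style) or `transformer.resblocks.<i>.` (OpenCLIP-style) naming. The helper `count_transformer_blocks` and the toy key lists are hypothetical and only stand in for real state dicts.

```python
import re

# Matches the block index in key names such as
# "text_model.encoder.layers.22.mlp.fc1.weight" or
# "transformer.resblocks.23.mlp.c_fc.weight".
_BLOCK_INDEX = re.compile(r"(?:layers|resblocks)\.(\d+)\.")


def count_transformer_blocks(keys):
    """Return the number of distinct transformer block indices found in `keys`."""
    indices = set()
    for key in keys:
        match = _BLOCK_INDEX.search(key)
        if match:
            indices.add(int(match.group(1)))
    return len(indices)


# Toy key lists (illustrative only, not loaded from any checkpoint).
sd21_like_keys = [
    f"text_model.encoder.layers.{i}.mlp.fc1.weight" for i in range(23)
] + ["text_model.final_layer_norm.weight"]  # non-block key, ignored

openclip_like_keys = [
    f"transformer.resblocks.{i}.mlp.c_fc.weight" for i in range(24)
] + ["ln_final.weight"]  # non-block key, ignored

print(count_transformer_blocks(sd21_like_keys))
print(count_transformer_blocks(openclip_like_keys))
```

With a real checkpoint you would pass `state_dict.keys()` instead of the toy lists. If the checkpoint also contains a vision tower, filter to the text-encoder keys first, otherwise both towers' block indices get merged into one count.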

According to your comment, laion/CLIP-ViT-H-14-laion2B-s32B-b79K has one more layer than SD2.1. Is the additional layer a projection layer that just modifies the output dimension? Thank you! :)
