Model not working

#3
by bishmdl - opened

I tried to use the hosted Inference API for the ALIGN model, but it is not working. Error message received: The model_type 'align' is not recognized. It could be a bleeding edge model, or incorrect.

I have also tried to import AlignModel and AlignProcessor from the transformers library, and I get an ImportError there as well. There seems to be some error in the model. Any updates/help would be highly appreciated!

Kakao Brain org

@bishmdl

The transformers release that includes the ALIGN model is not out yet; the model is only on the main branch. To use ALIGN now, install transformers from source:

pip install git+https://github.com/huggingface/transformers


from transformers import AlignProcessor, AlignModel

processor = AlignProcessor.from_pretrained("kakaobrain/align-base")
model = AlignModel.from_pretrained("kakaobrain/align-base")

""" 
Downloading (…)rocessor_config.json: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 508/508 [00:00<00:00, 59.5kB/s]
Downloading (…)okenizer_config.json: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 399/399 [00:00<00:00, 53.4kB/s]
Downloading (…)solve/main/vocab.txt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 232k/232k [00:00<00:00, 279kB/s]
Downloading (…)cial_tokens_map.json: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 125/125 [00:00<00:00, 49.1kB/s]
Downloading (…)lve/main/config.json: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 5.25k/5.25k [00:00<00:00, 660kB/s]
Downloading pytorch_model.bin: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 690M/690M [00:31<00:00, 22.1MB/s]
"""

I think something is wrong with the weight initialization of the ALIGN model class. It shows me a warning that some layers are newly initialized and should be trained!

How to fine-tune this model?

@Sersh You can fine-tune this model in the same way you fine-tune other PyTorch models. Here's one way you can do it:

import torch
import torch.nn as nn
from transformers import AlignModel

class AlignClassifier(nn.Module):
    def __init__(self, num_classes):
        super(AlignClassifier, self).__init__()
        self.model = AlignModel.from_pretrained("kakaobrain/align-base")
        # embedding size of Align Model is 640 for each modality.
        hidden_size = 640 + 640
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, **out):
        outputs = self.model(**out)
        image_embeds = outputs.image_embeds
        text_embeds = outputs.text_embeds
        # concatenate both embeddings
        embeds = torch.cat((image_embeds, text_embeds), dim=1)
        outputs = self.fc(embeds)
        return outputs
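To show it end to end, here is a self-contained sketch of a single training step (the head is repeated so the snippet runs standalone; the 3-class task, dummy image, caption, label, and learning rate are all made up for illustration):

```python
import torch
import torch.nn as nn
from PIL import Image
from transformers import AlignModel, AlignProcessor

class AlignClassifier(nn.Module):
    """Same classifier head as above, repeated so this snippet runs on its own."""
    def __init__(self, num_classes):
        super().__init__()
        self.model = AlignModel.from_pretrained("kakaobrain/align-base")
        # ALIGN produces 640-dim embeddings per modality; we concatenate them
        self.fc = nn.Linear(640 + 640, num_classes)

    def forward(self, **inputs):
        outputs = self.model(**inputs)
        embeds = torch.cat((outputs.image_embeds, outputs.text_embeds), dim=1)
        return self.fc(embeds)

processor = AlignProcessor.from_pretrained("kakaobrain/align-base")
classifier = AlignClassifier(num_classes=3)  # hypothetical 3-class task

# dummy image and caption, purely for illustration
image = Image.new("RGB", (224, 224))
inputs = processor(text=["a sample caption"], images=image, return_tensors="pt")

logits = classifier(**inputs)  # shape: (1, 3)

# one optimization step with a made-up label
optimizer = torch.optim.AdamW(classifier.parameters(), lr=1e-5)
loss = nn.functional.cross_entropy(logits, torch.tensor([0]))
loss.backward()
optimizer.step()
```

In practice you would freeze or lower the learning rate on the pretrained backbone and only train the new `fc` head at first.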

@rabiulawal If you use the from_pretrained method correctly, the model should load the trained weights. Make sure you are not initializing from a config: that gives you only the architecture, matching your configuration, with randomly initialized weights.
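The difference in a short sketch (the checkpoint id is the one from this thread; instantiating from a default config is exactly what triggers the "newly initialized" warning):

```python
from transformers import AlignConfig, AlignModel

# Architecture only: weights are randomly initialized, so the model warns
# that layers need training
random_model = AlignModel(AlignConfig())

# Pretrained: weights are loaded from the checkpoint, no such warning
pretrained_model = AlignModel.from_pretrained("kakaobrain/align-base")
```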

model = AlignVisionModel.from_pretrained('/opt/licy/vms/align')

I used this code to load the model. Why does it show a lot of missing parameters?

Some weights of AlignVisionModel were not initialized from the model checkpoint at /opt/licy/vms/align and are newly initialized: ['encoder.blocks.24.expansion.expand_bn.running_var', 'encoder.blocks.33.projection.project_bn.running_mean', 'encoder.blocks.12.depthwise_conv.depthwise_norm.num_batches_tracked', 'encoder.blocks.36.projection.project_bn.bias', 'encoder.blocks.40.squeeze_excite.expand.weight', 'encoder.blocks.23.depthwise_conv.depthwise_conv.weight', ...] (long list of encoder.blocks.* parameters truncated)
