hidden_states dimensionality

#3
by Janni - opened

Hey, I'm playing around with your model and trying to figure out if I can use the hidden_states for semantic search.

Can you explain to me why, for an empty input, the dimensionality of the hidden_states is torch.Size([25, 1, 2, 1024])?
As far as I can see, the encoder (RobertaEncoder) has 24 RobertaLayer modules, so where is the 25 coming from?
Shouldn't the dimensionality be num_hidden_layers * 1 * tokens * hidden_size?

Janni changed discussion title from hidden_states to hidden_states dimensionality

Hey,

Thank you for using this model! Could you please provide a code snippet, so I can know what you are trying to do?

Sure. I changed the config to "output_hidden_states": true and used the following code:

from transformers import (
    TokenClassificationPipeline,
    AutoModelForTokenClassification,
    AutoTokenizer,
)
from transformers.pipelines import AggregationStrategy
import numpy as np

# Define keyphrase extraction pipeline

class KeyphraseExtractionPipeline(TokenClassificationPipeline):
    def __init__(self, model, *args, **kwargs):
        super().__init__(
            model=AutoModelForTokenClassification.from_pretrained(model),
            tokenizer=AutoTokenizer.from_pretrained(model),
            *args,
            **kwargs
        )

    def _forward(self, model_inputs):
        # Forward
        special_tokens_mask = model_inputs.pop("special_tokens_mask")
        offset_mapping = model_inputs.pop("offset_mapping", None)
        sentence = model_inputs.pop("sentence")
        outputs = self.model(**model_inputs)
        logits = outputs[0]

        # These are the outputs I am talking about: with output_hidden_states=True,
        # outputs[1] is the tuple of hidden states (get_embeddings is a helper of mine, not shown).
        embedding = get_embeddings(outputs[1])

        return {
            "logits": logits,
            "special_tokens_mask": special_tokens_mask,
            "offset_mapping": offset_mapping,
            "sentence": sentence,
            "hidden_state": outputs[1],
            "embedding": embedding,
            **model_inputs,
        }


    def postprocess(self, model_outputs):
        results = super().postprocess(
            model_outputs=model_outputs,
            aggregation_strategy=AggregationStrategy.SIMPLE,
        )
        return {
            **model_outputs,
            "keywords": np.unique(
                [result.get("word").strip() for result in results]
            ).tolist(),
        }

model_path = "keyphrase-extraction-kbir-inspec"
extractor = KeyphraseExtractionPipeline(model=model_path)
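
For reference, this is roughly how I inspect the returned hidden state (just a sketch; the torch.stack is only there to show the shape I mean):

import torch

# With an empty input the sequence only contains <s> and </s>, which is
# where the torch.Size([25, 1, 2, 1024]) from my question comes from.
result = extractor("")

# "hidden_state" is outputs[1] from _forward: a tuple of tensors,
# each of shape (batch, tokens, hidden_size).
print(torch.stack(result["hidden_state"]).shape)  # torch.Size([25, 1, 2, 1024])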

Hey @Janni ,

There is a reason there are 25 entries: the output of the embedding layer is included as well. So hidden_states consists of the embedding output plus the output of each of the 24 encoder layers, i.e. num_hidden_layers + 1 = 25.

You can find more information in this GitHub thread: https://github.com/huggingface/transformers/issues/1332.
You can also find this in the Hugging Face RoBERTa documentation.
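
You can verify this quickly by loading the bare model and comparing the length of hidden_states with num_hidden_layers. A small sketch, using the same model path as in your snippet:

import torch
from transformers import AutoConfig, AutoModelForTokenClassification, AutoTokenizer

model_path = "keyphrase-extraction-kbir-inspec"
config = AutoConfig.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForTokenClassification.from_pretrained(
    model_path, output_hidden_states=True
)

# An empty string still produces the two special tokens <s> and </s>.
inputs = tokenizer("", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(config.num_hidden_layers)    # 24 encoder layers
print(len(outputs.hidden_states))  # 25 = embedding output + 24 layer outputs
# hidden_states[0] is the embedding output; hidden_states[-1] is the last layer.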

Hope it helps!

Kind regards,
Thomas De Decker

DeDeckerThomas changed discussion status to closed
