Output is nonsensical
First, thanks for posting your model and for providing instructions. However, when I follow them, I get only garbled, nonsensical results. Would you have any suggestions?
I have copy-pasted your instructions:
import torch
from transformers import pipeline
summarizer = pipeline(
    "summarization",
    "pszemraj/long-t5-tglobal-base-16384-book-summary",
    device=0 if torch.cuda.is_available() else -1,
)
long_text = "Here is a lot of text I don't want to read. Replace me"
result = summarizer(long_text)
print(result[0]["summary_text"])
And, for example, I get the output:
" and conquer point eventually jump
The full output is:
/home/matthew/venv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
warnings.warn(
Your max_length is set to 512, but your input_length is only 18. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=9)
/home/matthew/venv/lib/python3.10/site-packages/transformers/modeling_utils.py:1055: FutureWarning: The `device` argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
" and conquer point eventually jump
I have torch version 2.4.0 and transformers version 4.44.0.
Any suggestions would be greatly appreciated.
Thanks in advance,
Matthew
If I keep your Python instructions but replace your model with another, I get reasonable output:
import torch
from transformers import pipeline
summarizer = pipeline(
    "summarization",
    # "pszemraj/long-t5-tglobal-base-16384-book-summary",
    "stevhliu/my_awesome_billsum_model",
    device=0 if torch.cuda.is_available() else -1,
)
long_text = "Here is a lot of text I don't want to read. Replace me"
result = summarizer(long_text)
print(result[0]["summary_text"])
This gives:
Here is a lot of text I don't want to read. Replace me with a text that I'm not sure I'd like to use.
It's not a great summary, but the fact that it "works" indicates to me that there might be an issue with pszemraj/long-t5-tglobal-base-16384-book-summary. Is there a way I can validate that the model I downloaded isn't somehow corrupt?
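(One check I could think of: force a fresh copy of the model files from the Hub and see if the behavior changes. A minimal sketch using huggingface_hub, in case that's a sensible test:)

# Sketch: re-download the model files, overwriting any cached copy,
# to rule out local corruption (requires the huggingface_hub package).
from huggingface_hub import snapshot_download

snapshot_download(
    "pszemraj/long-t5-tglobal-base-16384-book-summary",
    force_download=True,  # ignore the local cache and fetch fresh files
)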
Thanks again,
Matthew
hi! so the output did not look like that at the time this model was created. I think there is an issue with the Long-T5 modeling code in transformers itself. I would recommend trying the following package versions:
pip install -q transformers==4.31.0
pip install accelerate==0.20.3 -q
pip install sentencepiece -q
This is what I have set in the Colab for my XL model based on this arch. I haven't had time to investigate this or raise an issue in the transformers repo, but rolling back to earlier versions should help.
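After installing, it's worth a quick sanity check that the pinned versions are the ones actually being imported (a minimal sketch):

# Sanity check: confirm the rolled-back versions are active.
import transformers
import accelerate

print(transformers.__version__)  # expect 4.31.0
print(accelerate.__version__)    # expect 0.20.3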
edit:
Yep, that seems to be the case. Check out this Google Colab notebook where I installed those versions; the output is reasonable: "This is a very long list of text I do not want to read, replace me with you. Replace me."
Thanks for your very quick response. I was pleased to see that your XL model with the latest version of transformers performs as expected.
Thanks again,
Matthew
happy to help, and thanks for flagging this issue! I'll add some notes to the model card(s).