🚩 Report
The model appears to be broken and does not produce coherent output. Here's an example:
Input:
"LongT5 model is an encoder-decoder transformer pre-trained in a text-to-text denoising generative setting (Pegasus-like generation pre-training). LongT5 model is an extension of T5 model, and it enables using one of the two different efficient attention mechanisms - (1) Local attention, or (2) Transient-Global attention. The usage of attention sparsity patterns allows the model to efficiently handle input sequence.
LongT5 is particularly effective when fine-tuned for text generation (summarization, question answering) which requires handling long input sequences (up to 16,384 tokens)."
Output:
"matematic matematic orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid"
The same happens with other examples.
I am also having a similar issue. On both the fine-tuned and base versions of this model, I get output like the following on a simple summarization task:
" informal informal Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt"
" the a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a"
Which precision are you running the model at? I have noticed issues when loading it in 16-bit precision, so that may be the cause. Try running it with torch_dtype=torch.float32, or simply omit the dtype argument when loading the model.
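For context, a minimal sketch of why half precision can break generation like this: fp16 tops out at roughly 65504, and large intermediate activations (e.g. attention logits in T5-family models) can overflow to inf, after which decoding degenerates into a single repeated token. This snippet only demonstrates the overflow mechanism, not the model itself:

```python
import torch

# fp16 can represent values only up to ~65504; anything larger
# overflows to inf when cast down from fp32.
big = torch.tensor(70000.0)            # fine in fp32

half = big.to(torch.float16)           # overflows to inf
full = big.to(torch.float32)           # represented without overflow

print(torch.isinf(half))               # tensor(True)
print(torch.isinf(full))               # tensor(False)
```

If that matches your symptoms, loading the checkpoint in full precision, e.g. `AutoModelForSeq2SeqLM.from_pretrained("google/long-t5-tglobal-base", torch_dtype=torch.float32)` (checkpoint name shown only as an example, substitute your own), should restore sensible output.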