A question and request

#1
by HR1777 - opened

I have noticed that you develop very interesting models. I have been searching for a model that can handle long inputs, such as those with more than 3,000 words, and then generate a comprehensive article based on that input. However, I have not been able to find any model that can handle such long inputs so far.

I have tried various models with maximum token lengths of 64K, 128K, and even 200K. Unfortunately, it seems that all of these models either forget parts of the input text or fail to produce adequately lengthy articles.

  1. I would like to inquire about the capabilities of your models. Can they handle very long inputs containing more than 3,000 words and subsequently generate a thorough article based on that input?
  2. Additionally, is it feasible for you to develop a model with such capabilities?

Thank you for your time, and I look forward to your response.

Hey thanks for the questions and interest in the models. This one does have a 200k context window, but hallucinations are a part of LLMs for now.

However, the concept of extractive summary can be implemented through a custom logits processor outside of the model.

This will allow any text generation model to summarize large documents using only tokens found in the original document. You can then implement a scale from 0-1 that will increase or decrease the likelihood of the tokens from the original document being present in the response.
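The idea above can be sketched in plain Python. This is only an illustration of the mechanics, not the model author's code: in practice you would subclass `transformers.LogitsProcessor` (the feature linked below) and pass it to `generate()` via `logits_processor=`, operating on torch tensors. The function name `bias_toward_document` and the `boost` parameter are made up here for clarity.

```python
# Sketch of an "extractive" bias: add boost * strength to the logit of every
# token id that occurs in the source document. strength in [0, 1]: 0 leaves
# the logits unchanged, 1 applies the full boost, making document tokens far
# more likely to be sampled. A real version would be a transformers
# LogitsProcessor working on a torch tensor instead of a Python list.

def bias_toward_document(logits, document_token_ids, strength=0.5, boost=10.0):
    doc = set(document_token_ids)
    return [
        score + boost * strength if tok in doc else score
        for tok, score in enumerate(logits)
    ]
```

Sliding `strength` from 0 to 1 moves the model smoothly from free generation toward an extractive summary built mostly from the document's own tokens.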

I hope this helps. I would absolutely give this model a try to see if it’s helpful to you. I think it’s particularly fun to experiment with.


Thank you so much for your explanations. Could you please explain a little more about this method? What does scaling from 0 to 1 mean, and how can I implement it using text-generation-webui?

I’m actually not sure how to do it in text-generation-webui. I’ve only done it once using vanilla Transformers.

I will also say that what I have described is not a hyperparameter, but an additional processor that you must create yourself. It’s a fairly advanced technique that would need to be tailored to your specific use case.

Here’s a link to the feature I’m talking about. https://github.com/huggingface/transformers/pull/14779

Don’t get hung up on this too much. This is just one way to accomplish your task.

Another one is to use retrieval augmented generation so the model grabs the content from the documents explicitly when you prompt it and it recognizes the tokens.

This strategy is far easier to implement and would probably show better results.
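A minimal sketch of what RAG does, as a rough illustration rather than any particular library’s API: split the document into chunks, retrieve the chunks most relevant to the query, and paste them into the prompt. Word overlap is used here as a stand-in for retrieval; real systems normally score chunks by embedding similarity (e.g. sentence-transformers plus a vector store). All function names below are invented for this example.

```python
# Naive RAG pipeline: chunk -> retrieve by word overlap -> build prompt.

def chunk_text(text, chunk_size=200):
    # Split the document into fixed-size word chunks.
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def retrieve(chunks, query, top_k=3):
    # Rank chunks by how many query words they share (toy relevance score).
    q = set(query.lower().split())
    ranked = sorted(chunks,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def build_prompt(document, query):
    # Paste the retrieved chunks into the prompt so the model sees the
    # relevant source tokens explicitly at generation time.
    context = "\n---\n".join(retrieve(chunk_text(document), query))
    return (f"Use the context below to answer.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")
```

Because only the relevant chunks enter the prompt, even a modest-context model can work over a long document, which is why this tends to be the easier path for your 3,000-word inputs.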

Thank you so much for your detailed reply. Do you know any model based on retrieval augmented generation method that can handle long articles with about 3,000 words and subsequently generate a thorough article based on that input?

I would use something like this https://resources.nvidia.com/en-us-generative-ai-chatbot-workflow/retrieval-augmented-generation-explainer but with a higher-context model like you have suggested. Many of them would be suitable for your task.

Hey I just saw this and thought it would be helpful for you moving forward. @HR1777 https://read-agent.github.io/

Thank you so much for your help. I really appreciate it.

macadeliccc changed discussion status to closed
