[Cache Request] google/flan-ul2

by gaussfer - opened

Please add google/flan-ul2 to the Neuron cache.

AWS Inferentia and Trainium org

Hi @gaussfer, as discussed on the optimum-neuron repo, we will need to add tensor parallel (TP) support first. Unfortunately, TP support for T5 is currently not working because of a tracing issue. AWS is working on a fix, and the optimum team will prioritize T5 TP support once it is resolved.

Related: https://github.com/huggingface/optimum-neuron/issues/479

