[Cache Request] Helsinki-NLP/opus-mt-en-de

by k10 - opened

Please add the following model to the Neuron cache: Helsinki-NLP/opus-mt-en-de

AWS Inferentia and Trainium org

The inference cache is currently supported only for causal LM models. cc @Jingya
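For anyone wondering whether a model falls into the causal LM category before filing a request, here is a rough sketch. It is only a heuristic over the fields of a model's `config.json`, not how optimum-neuron actually decides. The helper name `is_causal_lm` and the two abbreviated example configs are assumptions made up for illustration.

```python
def is_causal_lm(config: dict) -> bool:
    """Heuristic: True if a config.json-style dict looks like a decoder-only causal LM.

    Hypothetical helper for illustration; it only inspects two common config fields.
    """
    # Encoder-decoder (seq2seq) models such as Marian set this flag to True.
    if config.get("is_encoder_decoder", False):
        return False
    # Causal LM classes in transformers usually end in one of these suffixes.
    architectures = config.get("architectures") or []
    return any(name.endswith(("ForCausalLM", "LMHeadModel")) for name in architectures)


# Abbreviated, illustrative configs (assumed field values, not copied from the Hub).
marian_like = {
    "model_type": "marian",
    "is_encoder_decoder": True,
    "architectures": ["MarianMTModel"],
}
llama_like = {"model_type": "llama", "architectures": ["LlamaForCausalLM"]}

print(is_causal_lm(marian_like))
print(is_causal_lm(llama_like))
```

A translation model like the one requested here is encoder-decoder, so a check along these lines would flag it as outside the causal LM category.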

AWS Inferentia and Trainium org

Hi @k10, Marian-type models are not yet supported by optimum-neuron. To add this model to the cache, we first need to add export and inference support for it.

I opened a ticket here, feel free to pick the task up if you want to contribute!
