A new idea to improve training and inference performance

#82
by lijip26313

Hello Google Team, I have an idea to significantly improve LLM performance: https://www.kaggle.com/code/vasilypodorov/fast-language-modelling-with-un-formers

There I trained a 0.7B-parameter LLM with a new architecture, reaching a training throughput of approximately 0.7B tokens per hour on a TPU v3-8. The details are in the article linked above.
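As a rough sanity check of that number (my own back-of-envelope arithmetic, not taken from the article), the claimed throughput works out to roughly 194k tokens per second across the whole v3-8, assuming the figure covers all 8 cores:

```python
# Back-of-envelope conversion of the claimed throughput.
# Assumption: the 0.7B tokens/hour figure is for the full TPU v3-8 (8 cores).
tokens_per_hour = 0.7e9
tokens_per_second = tokens_per_hour / 3600          # ~194,000 tokens/s overall
tokens_per_second_per_core = tokens_per_second / 8  # ~24,300 tokens/s per core

print(f"{tokens_per_second:,.0f} tokens/s total, "
      f"{tokens_per_second_per_core:,.0f} tokens/s per core")
```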

Could you read it and tell me whether it makes sense? I would also like you to try training a small LLM with this technique to decide whether it is useful. This should take only a few TPU-days.
