Results comparable to or better than MPT, Falcon, LLaMA, and OpenLLaMA on text and code tasks.
On standard NLP benchmarks, XGen achieves results comparable to or better than those of state-of-the-art open-source LLMs of similar model size (e.g., MPT, Falcon, LLaMA, RedPajama, OpenLLaMA).
Our targeted evaluation on long-sequence modeling benchmarks shows the benefits of our 8K-seq models over 2K- and 4K-seq models.
XGen-7B achieves equally strong results on both text (e.g., MMLU, QA) and code (HumanEval) tasks.